INDEPENDENT SETS IN HYPERGRAPHS 



JOZSEF BALOGH, ROBERT MORRIS, AND WOJCIECH SAMOTIJ 

Abstract. Many important theorems and conjectures in combinatorics, such as the theorem
of Szemeredi on arithmetic progressions and the Erdos-Stone Theorem in extremal graph
theory, can be phrased as statements about families of independent sets in certain uniform
hypergraphs. In recent years, an important trend in the area has been to extend such
classical results to the so-called 'sparse random setting'. This line of research has recently
culminated in the breakthroughs of Conlon and Gowers and of Schacht, who developed
general tools for solving problems of this type. Although these two papers solved very
similar sets of longstanding open problems, the methods used are very different from one
another and have different strengths and weaknesses.

In this paper, we provide a third, completely different approach to proving extremal
and structural results in sparse random sets that also yields their natural 'counting'
counterparts. We give a structural characterization of the independent sets in a large class of
uniform hypergraphs by showing that every independent set is almost contained in one of
a small number of relatively sparse sets. We then derive many interesting results as fairly
straightforward consequences of this abstract theorem. In particular, we prove the
well-known conjecture of Kohayakawa, Luczak, and Rodl, a probabilistic embedding lemma for
sparse graphs, for all 2-balanced graphs. We also give alternative proofs of many of the
results of Conlon and Gowers and Schacht, such as sparse random versions of Szemeredi's
theorem, the Erdos-Stone Theorem and the Erdos-Simonovits Stability Theorem, and obtain
their natural 'counting' versions, which in some cases are considerably stronger. We
also obtain new results, such as a sparse version of the Erdos-Frankl-Rodl Theorem on the
number of H-free graphs and, as a consequence of the KLR conjecture, we extend a result
of Rodl and Rucinski on Ramsey properties in sparse random graphs to the general,
non-symmetric setting. Similar results have been discovered independently by Saxton and
Thomason.
1. Introduction

A great many of the central questions in combinatorics fall into the following general
framework: Given a finite set $V$ and a collection $\mathcal{H} \subseteq \mathcal{P}(V)$ of forbidden structures, what
can be said about sets $I \subseteq V$ that do not contain any member of $\mathcal{H}$? For example, the
celebrated theorem of Szemeredi [62] states that if $V = \{1, \ldots, n\}$ and $\mathcal{H}$ is the collection of
$k$-term arithmetic progressions in $\{1, \ldots, n\}$, then every set $I$ that contains no member of $\mathcal{H}$
satisfies $|I| = o(n)$. The archetypal problem studied in extremal graph theory, dating back
to the work of Turan [6l] and Erdos and Stone [18], is the problem of characterizing such
sets $I$ when $V$ is the edge set of the complete graph on $n$ vertices and $\mathcal{H}$ is the collection of
copies of some fixed graph $H$ in $K_n$. In this setting, a great deal is known, not only about
the maximum size of $I$ that contains no member of $\mathcal{H}$, but also about what the largest such sets
look like, how many such sets there are, and what the structure of a typical such set is.

A collection $\mathcal{H} \subseteq \mathcal{P}(V)$ as above is usually referred to as a hypergraph on the vertex
set $V$, and any set $I \subseteq V$ that contains no element (edge) of $\mathcal{H}$ is called an independent
set. Therefore, one might say that a large part of extremal combinatorics is concerned with
studying independent sets in various specific hypergraphs. We might add here that in many
natural settings, such as the two mentioned above, the hypergraphs considered are uniform,
that is, all edges of $\mathcal{H}$ have the same size.

Although it might at first seem somewhat artificial to study concrete questions in such
an abstract setting, the past few years have proved that taking such a general approach can be
highly beneficial. The recently proved general transference theorems of Conlon and
Gowers [12] and Schacht [58], which imply, among other things, sparse random analogues of
the classical theorems of Szemeredi and of Erdos and Stone, were stated in the language
of hypergraphs. Roughly speaking, these transference theorems say that if the edges of a
hypergraph $\mathcal{H}$ are sufficiently uniformly distributed, then the independence number of $\mathcal{H}$
is 'well-behaved' with respect to taking subhypergraphs induced by (sufficiently dense)
random subsets of the vertex set. More precisely, given $p \in [0, 1]$ and a finite set $V$, we shall
write $V_p$ to denote the $p$-random subset of $V$, that is, the random subset of $V$ in which each
element of $V$ is included with probability $p$, independently of all other elements. We write
$\alpha(\mathcal{H})$ and $v(\mathcal{H})$ to denote the size of the largest independent set and the number of vertices
in a hypergraph $\mathcal{H}$, respectively. The results of Conlon and Gowers [12] and Schacht [58]
imply, in particular, that if the distribution of the edges of some uniform hypergraph $\mathcal{H}$ is
sufficiently 'balanced', then with probability tending to 1 as $v(\mathcal{H}) \to \infty$,

$$\alpha\big(\mathcal{H}[V(\mathcal{H})_p]\big) \le p\,\alpha(\mathcal{H}) + o\big(p\,v(\mathcal{H})\big),$$

provided that $p$ is sufficiently large.

In this work, we give an approximate structural characterization of the family of all
independent sets in uniform hypergraphs whose edge distribution satisfies a certain natural
boundedness condition. More precisely, we shall prove that the independent sets of each
such hypergraph $\mathcal{H}$ exhibit a certain clustering phenomenon. Our main result, Theorem 2.2
below, states that the family $\mathcal{I}(\mathcal{H})$ of independent sets in $\mathcal{H}$ admits a partition into relatively
few classes such that all members of each class are essentially contained in a single subset of
$V(\mathcal{H})$ that is almost independent, that is, it contains only a tiny proportion of all the edges
of $\mathcal{H}$. This somewhat abstract statement has surprisingly many deep and interesting
consequences, some of which we list in the remainder of this section. We remark that Theorem 2.2
was partly inspired by the work of Kleitman and Winston [37], who implicitly considered
a statement of this type in the setting of graphs (2-uniform hypergraphs) and subsequently
used it to bound the number of $n$-vertex graphs without a 4-cycle. We also note that a result
similar to Theorem 2.2 was independently proved by Saxton and Thomason [57], who also
use it to derive many of the statements that we present in Sections 1.1-1.3.

1.1. The number of sets with no $k$-term arithmetic progression. The celebrated
theorem of Szemeredi [62] says that for every $k \in \mathbb{N}$, the largest subset of $\{1, \ldots, n\}$ that
contains no $k$-term arithmetic progression (AP) has $o(n)$ elements. It immediately follows
that there are only $2^{o(n)}$ subsets of $\{1, \ldots, n\}$ with no $k$-term AP. Our first result can be
viewed as a sparse analogue of this statement.

Theorem 1.1. For every positive $\beta$ and every $k \in \mathbb{N}$, there exist constants $C$ and $n_0$ such
that the following holds. For every $n \in \mathbb{N}$ with $n \ge n_0$, if $m \ge C n^{1-1/(k-1)}$, then there are at
most

$$\binom{\beta n}{m}$$

$m$-subsets of $\{1, \ldots, n\}$ that contain no $k$-term AP.
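To make the quantity bounded in Theorem 1.1 concrete, the following is a minimal brute-force sketch that counts the $m$-subsets of $\{1, \ldots, n\}$ with no $k$-term AP. The function names `has_k_ap` and `count_ap_free` are illustrative choices and not notation from the paper, and the enumeration is only feasible for very small $n$.

```python
from itertools import combinations


def has_k_ap(s, k):
    """Return True if s contains a k-term AP with positive common difference."""
    elems = set(s)
    if len(elems) < k:
        return False
    top = max(elems)
    for a in elems:
        d = 1
        # Try every common difference for which the AP still fits below max(s).
        while a + (k - 1) * d <= top:
            if all(a + i * d in elems for i in range(1, k)):
                return True
            d += 1
    return False


def count_ap_free(n, m, k):
    """Number of m-subsets of {1, ..., n} that contain no k-term AP."""
    return sum(
        1
        for s in combinations(range(1, n + 1), m)
        if not has_k_ap(s, k)
    )


if __name__ == "__main__":
    # Of the 10 three-element subsets of {1,...,5}, exactly 4 are 3-term APs.
    print(count_ap_free(5, 3, 3))
```

The theorem asserts that once $m \ge C n^{1-1/(k-1)}$, this count is at most $\binom{\beta n}{m}$; the sketch only illustrates the definition and says nothing about the constants.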

We shall deduce Theorem 1.1 from our main theorem, Theorem 2.2, and a robust version
of Szemeredi's theorem, see Section 4. The sparse random analogue of Szemeredi's theorem,
proved by Schacht [58] and independently by Conlon and Gowers [12], follows as an easy
corollary of Theorem 1.1. Following [12], we shall say that a set $A \subseteq \mathbb{N}$ is $(\delta, k)$-Szemeredi
if every subset $B \subseteq A$ with at least $\delta|A|$ elements contains a $k$-term AP. For the sake of
brevity, let $[n] = \{1, \ldots, n\}$ and recall that $[n]_p$ denotes the $p$-random subset of $[n]$.

Corollary 1.2. For every $\delta \in (0, 1)$ and every $k \in \mathbb{N}$, there exists a constant $C$ such that
the following holds. If $p_n \ge C n^{-1/(k-1)}$ for all sufficiently large $n$, then

$$\lim_{n \to \infty} \mathbb{P}\big([n]_{p_n} \text{ is } (\delta, k)\text{-Szemeredi}\big) = 1.$$

We remark that Theorem 1.1 and Corollary 1.2 are both sharp up to the value of the
constant $C$, see the discussion in Section 4, where both of these statements are proved.

Our main result has a variety of other applications in additive combinatorics, see for
example [H [2] where, jointly with Alon, we used a much simpler version of it to count
sum-free sets of fixed size in various Abelian groups and the set $[n]$. In Section HJ we shall mention
two other applications: a generalization of Theorem 1.1 to higher dimensions and a sparse
counting version of a theorem of Sarkozy [56] and (independently) Furstenberg [2l] on square
differences in the integers. In each case, the random version (which was proved in [12, 58])
follows as an easy corollary.



1.2. Turan's problem in random graphs. A famous theorem of Erdos and Stone [18]
states that the maximum number of edges in an $H$-free graph on $n$ vertices, the Turan
number for $H$, denoted $\mathrm{ex}(n, H)$, satisfies

$$\mathrm{ex}(n, H) = \left(1 - \frac{1}{\chi(H) - 1} + o(1)\right) \binom{n}{2}, \qquad (1)$$

where $\chi(H)$ is the chromatic number of $H$. The analogue of this theorem for the
Erdos-Renyi random graph $G(n, p)$ was first studied by Babai, Simonovits, and Spencer [1], who
proved that asymptotically almost surely (a.a.s. for short), i.e., with probability tending
to 1 as $n \to \infty$, the largest triangle-free subgraph of $G(n, 1/2)$ is bipartite, and by Frankl
and Rodl [20], who proved that if $p \ge n^{-1/2+\varepsilon}$, then a.a.s. the largest triangle-free subgraph
of $G(n, p)$ has $pn^2/8 + o(pn^2)$ edges. The systematic study of the Turan problem in $G(n, p)$
was initiated by Haxell, Kohayakawa, and Luczak [33] and by Kohayakawa, Luczak, and
Rodl [H], who posed the following problem. For a fixed graph $H$, determine necessary and
sufficient conditions on a sequence $p \in [0, 1]^{\mathbb{N}}$ of probabilities such that, a.a.s.,

$$\mathrm{ex}\big(G(n, p_n), H\big) = \left(1 - \frac{1}{\chi(H) - 1} + o(1)\right) \binom{n}{2} p_n, \qquad (2)$$

where $\mathrm{ex}(G, H)$ denotes the maximum number of edges in an $H$-free subgraph of $G$.

By considering a random $(\chi(H) - 1)$-partition of the vertex set of $G(n, p)$, it is
straightforward to show that the inequality
$\mathrm{ex}\big(G(n, p), H\big) \ge \big(1 - \frac{1}{\chi(H) - 1} + o(1)\big) \binom{n}{2} p$ holds for every
$p \in [0, 1]$. On the other hand, if the number of copies of some subgraph $H' \subseteq H$ in $G(n, p)$
is much smaller than the number of edges in $G(n, p)$, then the converse inequality cannot
hold, since one can make any graph $H$-free by removing from it one edge from each copy of
$H'$. This observation motivates the notion of 2-density of $H$, denoted by $m_2(H)$, which is
defined by

$$m_2(H) = \max\left\{ \frac{e(H') - 1}{v(H') - 2} : H' \subseteq H \text{ with } v(H') \ge 3 \right\}. \qquad (3)$$
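As an illustration of definition (3), here is a small sketch that computes $m_2(H)$ exactly for a small graph and tests whether the maximum is attained by $H$ itself. The helper names `two_density` and `is_two_balanced` are hypothetical. The sketch relies on the observation that, for a fixed vertex set, the ratio in (3) is largest for the induced subgraph, so it suffices to range over vertex subsets of size at least 3.

```python
from fractions import Fraction
from itertools import combinations


def two_density(vertices, edges):
    """Exact value of m_2(H) from (3); assumes H has at least 3 vertices."""
    vertices = list(vertices)
    edge_sets = [frozenset(e) for e in edges]
    best = None
    for size in range(3, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            # Edges of the subgraph induced by the chosen vertices.
            induced = sum(1 for e in edge_sets if e <= chosen)
            ratio = Fraction(induced - 1, size - 2)
            if best is None or ratio > best:
                best = ratio
    return best


def is_two_balanced(vertices, edges):
    """True if the maximum in (3) is attained with H' = H."""
    vertices = list(vertices)
    whole = Fraction(len(edges) - 1, len(vertices) - 2)
    return two_density(vertices, edges) == whole


if __name__ == "__main__":
    triangle = [(1, 2), (2, 3), (1, 3)]
    print(two_density([1, 2, 3], triangle))          # m_2(K_3)
    print(is_two_balanced([1, 2, 3, 4], triangle + [(3, 4)]))
```

The second call uses a triangle with a pendant edge, where the triangle alone has a larger ratio than the whole graph, so the graph is not 2-balanced in the sense used later in this section.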

It now follows easily that for every graph $H$ with maximum degree at least 2 and every
$\delta \in \big(0, 1/(\chi(H) - 1)\big)$, there exists a positive constant $c$ such that if
$n^{-2} \ll p_n \le c n^{-1/m_2(H)}$, then a.a.s.

$$\mathrm{ex}\big(G(n, p_n), H\big) \ge \left(1 - \frac{1}{\chi(H) - 1} + \delta\right) \binom{n}{2} p_n.$$

It was conjectured by Haxell, Kohayakawa, and Luczak [33] and Kohayakawa, Luczak, and
Rodl [H] that the above simple argument, removing an arbitrary edge from each copy of
$H'$ in $G(n, p)$, is the main obstacle that prevents (2) from holding asymptotically almost
surely. The conjecture, often referred to as Turan's theorem for random graphs, has attracted
considerable attention in the past fifteen years. Numerous partial results and special cases
had been established by various researchers [2T|,[26l[29l|33l[3llllTlll3l[6T] before the conjecture
was finally proved by Conlon and Gowers [12] (under the assumption that $H$ is strictly
2-balanced^1) and by Schacht [58].

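Theorem 1.3. For every graph $H$ with $\Delta(H) \ge 2$ and every positive $\delta$, there exists a positive
constant $C$ such that if $p_n \ge C n^{-1/m_2(H)}$, then a.a.s.

$$\mathrm{ex}\big(G(n, p_n), H\big) \le \left(1 - \frac{1}{\chi(H) - 1} + \delta\right) \binom{n}{2} p_n.$$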



Our methods give yet another proof of Theorem 1.3 in the case when $H$ is 2-balanced.
Note that most natural graphs, such as cycles and cliques, are 2-balanced. In fact, we shall
deduce from our main result, Theorem 2.2, a version of the general transference theorem of
Schacht [58, Theorem 3.3], which easily implies Theorem 1.3 for such graphs $H$. Our version
of Schacht's transference theorem, Theorem 5.2, is stated and proved in Section 5. We then,
in Section 7, use it to derive a natural generalization of Theorem 1.3 to $t$-balanced $t$-uniform
hypergraphs, Theorem 7.2, which was also first proved in [12] and [58].

^1 A graph $H$ is 2-balanced if the maximum in (3) is achieved with $H' = H$, that is, if
$m_2(H) = \frac{e(H) - 1}{v(H) - 2}$. It is strictly 2-balanced if $m_2(H) > m_2(H')$ for every proper
subgraph $H' \subsetneq H$.
Our methods also yield the following sparse random analogue of the famous stability
theorem of Erdos and Simonovits [50], originally proved by Conlon and Gowers [12] in
the case when $H$ is strictly 2-balanced and then extended to arbitrary $H$ by Samotij [55],
who adapted the argument of Schacht [58] for this purpose.

Theorem 1.4. For every 2-balanced graph $H$ with $\Delta(H) \ge 2$ and every positive $\delta$, there
exist positive constants $C$ and $\varepsilon$ such that if $p_n \ge C n^{-1/m_2(H)}$, then a.a.s. the following holds.
Every $H$-free subgraph of $G(n, p_n)$ with at least

$$\left(1 - \frac{1}{\chi(H) - 1} - \varepsilon\right) \binom{n}{2} p_n$$

edges may be made $(\chi(H) - 1)$-partite by removing from it at most $\delta n^2 p_n$ edges.

As with Theorem 1.3, we shall in fact deduce Theorem 1.4 from a more general
statement, Theorem 6.2, which is a version of the general transference theorem for stability
results proved in [55]. Theorem 6.2 is stated and proved in Section 6; in Section 7, we use it
to derive Theorem 1.4.

1.3. The typical structure of $H$-free graphs. Let $H$ be an arbitrary non-empty graph.
We say that a graph $G$ is $H$-free if $G$ does not contain $H$ as a subgraph. For an integer
$n$, denote by $f_n(H)$ the number of labeled $H$-free graphs on the vertex set $[n]$. Since every
subgraph of an $H$-free graph is also $H$-free, it follows that $f_n(H) \ge 2^{\mathrm{ex}(n, H)}$. Erdos, Frankl,
and Rodl [15] proved that this crude lower bound is in a sense tight, namely that

$$f_n(H) = 2^{\mathrm{ex}(n, H) + o(n^2)}. \qquad (4)$$

Our next result can be viewed as a 'sparse version' of (4). Such a statement was already
considered by Luczak [45], who derived it from the so-called KLR conjecture, which we
discuss in the next subsection. For integers $n$ and $m$ with $0 \le m \le \binom{n}{2}$, let $f_{n,m}(H)$ be
the number of labeled $H$-free graphs on the vertex set $[n]$ that have exactly $m$ edges. The
following theorem refines (4) to $n$-vertex graphs with $m$ edges.

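Theorem 1.5. Let $H$ be a 2-balanced graph and let $\delta$ be a positive constant. There exists a
constant $C$ such that for every $n \in \mathbb{N}$, if $m \ge C n^{2-1/m_2(H)}$, then

$$\binom{\mathrm{ex}(n, H)}{m} \le f_{n,m}(H) \le \binom{\mathrm{ex}(n, H) + \delta n^2}{m}.$$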

In fact, we shall deduce from our main result, Theorem 2.2, a 'counting version' of the
general transference theorem of Schacht [58, Theorem 3.3], which easily implies Theorem 1.5.
This 'counting version' of Schacht's theorem is stated and proved in Section 5. We then use
it to derive Theorem 1.5 in Section 8. We remark that (4) was refined in a different sense
by Balogh, Bollobas, and Simonovits [5], who showed that $f_n(H) = 2^{\mathrm{ex}(n, H) + O(n^{2 - c(H)})}$, where
$c(H)$ is some positive constant, and also gave a very precise structural description of almost
all $H$-free graphs. We would also like to point out that our proof of Theorem 1.5 does not use
Szemeredi's regularity lemma, unlike the proof given in [IS] or the proofs of Erdos, Frankl,
and Rodl [15] and Balogh, Bollobas, and Simonovits [5].

The result of Erdos, Frankl, and Rodl has, in some cases, a structural counterpart that
significantly strengthens (4). For example, Erdos, Kleitman, and Rothschild [16] proved that
almost all triangle-free graphs are bipartite, that is, that with probability tending to 1 as
$n \to \infty$, a graph selected uniformly at random from the family of all triangle-free graphs
on the vertex set $[n]$ is bipartite or, in other words (since clearly every bipartite graph is
triangle-free), $f_n(K_3)$ is asymptotic to the number of bipartite graphs on the vertex set $[n]$.
Extending this result, Osthus, Promel, and Taraz [38] proved that if $m \ge C n^{3/2} \sqrt{\log n}$ for
some $C > \sqrt{3}/4$, then almost all $n$-vertex triangle-free graphs with $m$ edges are bipartite.
Our next result, which is a strengthening of Theorem 1.5, is an approximate version of this
statement for an arbitrary 2-balanced graph $H$. Such a statement was also considered by
Luczak [45], who derived it from the KLR conjecture. Following [15], given a positive real
$\delta$ and an integer $k$, let us say that a graph $G$ is $(\delta, k)$-partite if $G$ can be made $k$-partite by
removing from it at most $\delta e(G)$ edges.

Theorem 1.6. Let $H$ be a 2-balanced graph with $\chi(H) \ge 3$ and let $\delta$ be a positive constant.
There exists a constant $C$ such that if $m \ge C n^{2-1/m_2(H)}$, then almost all $H$-free graphs with
$n$ vertices and $m$ edges are $(\delta, \chi(H) - 1)$-partite.

As with Theorem 1.5, we shall in fact deduce Theorem 1.6 from a 'counting
version' of the general transference theorem for stability results proved in [55]. Our version
of it, Theorem 6.3, is stated and proved in Section 6. In Section 8, we use it to derive
Theorem 1.6. Once again, our proof does not use the regularity lemma, unlike that in [45].
Finally, we would like to mention that, as observed by Luczak [45], Theorem 1.6 has the
following elegant corollary.

Corollary 1.7. Let $H$ be a 2-balanced graph with $\chi(H) \ge 3$ and let $\varepsilon$ be a positive
constant. There exist constants $C$ and $n_0$ such that for every $n$ with $n \ge n_0$ and every $m$ with



where $G_{n,m}$ is a uniformly selected random $n$-vertex graph with $m$ edges.

We remark that a great deal more is known about the structure of a typical $H$-free graph
(drawn uniformly at random from the set of all $n$-vertex $H$-free graphs), see ^ and the
references therein for more details.

1.4. The KLR conjecture. The celebrated Szemeredi's regularity lemma [63], which is
considered to be one of the most important and powerful tools in extremal graph theory,
says that the vertex set of every graph may be divided into a bounded number of parts of
approximately the same size in such a way that most of the bipartite subgraphs induced
between pairs of parts of the partition satisfy a certain pseudo-randomness condition termed
$\varepsilon$-regularity. The strength of the regularity lemma lies in the fact that it may be combined
with the so-called embedding lemma to show that a graph contains particular subgraphs.
The combination of the regularity and embedding lemmas allows one to prove many
well-known theorems in extremal graph theory, such as the theorem of Erdos and Stone [18] and
the stability theorem of Erdos and Simonovits [HI EU], both mentioned in Section 1.2.

For sparse graphs, that is, $n$-vertex graphs with $o(n^2)$ edges, the original version of the
regularity lemma is vacuous since if the vertex set of a sparse graph is partitioned into a
bounded number of parts, then all induced bipartite subgraphs thus obtained are trivially
$\varepsilon$-regular, provided that $n$ is sufficiently large. However, it was independently observed by
Kohayakawa [38] and Rodl (unpublished) that the notion of $\varepsilon$-regularity may be extended
in a meaningful way to graphs with density tending to zero. Moreover, with this more
general notion of regularity, they were also able to prove an associated regularity lemma
which applies to a large class of sparse graphs, including (a.a.s.) the random graph $G(n, p)$.

Given a $p \in [0, 1]$ and a positive $\varepsilon$, we say that a bipartite graph between sets $V_1$ and $V_2$
is $(\varepsilon, p)$-regular if for every $W_1 \subseteq V_1$ and $W_2 \subseteq V_2$ with $|W_1| \ge \varepsilon |V_1|$ and $|W_2| \ge \varepsilon |V_2|$, the
density $d(W_1, W_2)$ of edges between $W_1$ and $W_2$ satisfies

$$|d(W_1, W_2) - d(V_1, V_2)| \le \varepsilon p.$$

A partition of the vertex set of a graph into $r$ parts $V_1, \ldots, V_r$ is said to be $(\varepsilon, p)$-regular
if $\big||V_i| - |V_j|\big| \le 1$ for all $i$ and $j$ and for all but at most $\varepsilon r^2$ pairs $(V_i, V_j)$, the graph
induced between $V_i$ and $V_j$ is $(\varepsilon, p)$-regular. The class of graphs to which the
Kohayakawa-Rodl regularity lemma applies are the so-called upper-uniform graphs. Given positive $\eta$ and
$K$, we say that an $n$-vertex graph $G$ is $(\eta, p, K)$-upper-uniform if for all $W \subseteq V(G)$ with
$|W| \ge \eta n$, the density of edges within $W$ satisfies $d(W) \le Kp$. This condition is satisfied by
many natural classes of graphs, including all subgraphs of random graphs of density $p$. The
sparse regularity lemma of Kohayakawa [38] and Rodl says the following.
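The definition of $(\varepsilon, p)$-regularity above can be checked directly on toy examples. The following is a minimal brute-force sketch, assuming the bipartite graph is given as a set of ordered pairs `(u, v)` with `u` in $V_1$ and `v` in $V_2$; the function names are illustrative, and the running time is exponential in $|V_1| + |V_2|$.

```python
from itertools import combinations
from math import ceil


def density(edges, w1, w2):
    """Edge density d(W1, W2) = e(W1, W2) / (|W1| |W2|)."""
    hits = sum(1 for u in w1 for v in w2 if (u, v) in edges)
    return hits / (len(w1) * len(w2))


def large_subsets(side, eps):
    """All non-empty subsets W of `side` with |W| >= eps * |side|."""
    smallest = max(1, ceil(eps * len(side)))
    for size in range(smallest, len(side) + 1):
        yield from combinations(side, size)


def is_regular(v1, v2, edges, eps, p):
    """Brute-force test of (eps, p)-regularity of the pair (V1, V2)."""
    base = density(edges, v1, v2)
    for w1 in large_subsets(v1, eps):
        for w2 in large_subsets(v2, eps):
            # Small tolerance guards against floating-point rounding.
            if abs(density(edges, w1, w2) - base) > eps * p + 1e-12:
                return False
    return True


V1, V2 = [0, 1, 2, 3], [4, 5, 6, 7]
matching = {(i, i + 4) for i in range(4)}

if __name__ == "__main__":
    print(is_regular(V1, V2, matching, 0.5, 1.0))
    print(is_regular(V1, V2, matching, 0.5, 0.25))
```

A perfect matching between two 4-element sets has density $1/4$. It passes the test with $p = 1$ but fails with $p = 1/4$, which illustrates why the error term must be scaled by $p$ for the notion to be meaningful in sparse graphs.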

The sparse Szemeredi regularity lemma. For all positive $\varepsilon$, $K$, and $r_0$, there exist a
positive constant $\eta$ and an integer $R$ such that for every $p \in [0, 1]$, the following holds. Every
$(\eta, p, K)$-upper-uniform graph with at least $r_0$ vertices admits an $(\varepsilon, p)$-regular partition of its
vertex set into $r$ parts, for some $r \in \{r_0, \ldots, R\}$.

We remark that a version of this theorem avoiding the need for the upper-uniformity
assumption was recently proved by Scott [59].

The aforementioned embedding lemma roughly says that if we start with an arbitrary
graph $H$, replace its vertices by large independent sets and its edges by $\varepsilon$-regular bipartite
graphs with density bounded away from zero, then this blown-up graph will contain a copy
of $H$. To make it more precise, let $H$ be a graph on the vertex set $\{1, \ldots, v(H)\}$, let $\varepsilon$
and $p$ be as above, and let $n$ and $m$ be integers satisfying $0 \le m \le n^2$. Let us denote by
$\mathcal{G}(H, n, m, p, \varepsilon)$ the collection of all graphs $G$ constructed in the following way. The vertex
set of $G$ is a disjoint union $V_1 \cup \ldots \cup V_{v(H)}$ of sets of size $n$, one for each vertex of $H$. For
each edge $\{i, j\} \in E(H)$, we add to $G$ an $(\varepsilon, p)$-regular bipartite graph with $m$ edges between
the sets $V_i$ and $V_j$. These are the only edges of $G$. Given any graph $G$ as above, we define
canonical copies of $H$ to be all copies of $H$ in $G$ in which (the image of) each vertex
$i \in V(H)$ lies in the set $V_i \subseteq V(G)$. With this notation in hand, we can state
the embedding lemma.

The embedding lemma. For every graph $H$ and every positive $d$, there exist a positive
$\varepsilon$ and an integer $n_0$ such that for every $n$ and $m$ with $n \ge n_0$ and $m \ge d n^2$, every
$G \in \mathcal{G}(H, n, m, 1, \varepsilon)$ contains a canonical copy of $H$.

One might hope that a similar statement holds when one replaces 1 by an arbitrary $p$ and
the assumption $m \ge d n^2$ by $m \ge p d n^2$, even if $p$ is a decreasing function of $n$. However, for
an arbitrary function $p$, this is too much to hope for. Indeed, consider the random 'blow-up'
of $H$, that is, the random graph $G$ obtained from $H$ by replacing each vertex of $H$ by an
independent set of size $n$ and each edge of $H$ by a random bipartite graph with $p n^2$ edges.
With high probability, the number of canonical copies of $H$ in $G$ will be about $p^{e(H)} n^{v(H)}$
and hence if $p^{e(H)} n^{v(H)} \ll p n^2$, then one can remove all copies of $H$ from $G$ by deleting a tiny
proportion of all edges. Since in the above argument one may replace $H$ with an arbitrary
subgraph $H' \subseteq H$, it follows easily that if $p \le c n^{-1/m_2(H)}$ for some sufficiently small positive
constant $c$, then there are graphs in $\mathcal{G}(H, n, p n^2, p, \varepsilon)$ that do not contain any canonical
copies of $H$.
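The exponent $1/m_2(H)$ in the preceding argument can be read off from a short calculation. The following is a heuristic sketch only; it ignores constants and assumes $e(H') \ge 2$ so that the division is legitimate.

```latex
% Heuristic threshold for the random blow-up, for a fixed subgraph H' of H
% with v(H') >= 3 and e(H') >= 2:
\[
  p^{e(H')} n^{v(H')} \ll p n^{2}
  \iff p^{e(H') - 1} \ll n^{-(v(H') - 2)}
  \iff p \ll n^{-\frac{v(H') - 2}{e(H') - 1}} .
\]
% It suffices that this holds for some H', so the relevant range is
\[
  p \ll \max_{H' \subseteq H} n^{-\frac{v(H') - 2}{e(H') - 1}}
    = n^{-1/m_2(H)} ,
\]
% by the definition (3) of the 2-density.
% Example: for H = K_3 we have m_2(K_3) = 2, and indeed
\[
  p^{3} n^{3} \ll p n^{2} \iff p \ll n^{-1/2} .
\]
```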

As in the case of Turan's theorem for random graphs, see Section 1.2, one might still hope
that if $p \ge C n^{-1/m_2(H)}$ for some large constant $C$, then the natural sparse analogue of the
embedding lemma discussed above holds. However, it was observed by Luczak (see [201112])
that, somewhat surprisingly, for any graph $H$ which contains a cycle and any function $p$
satisfying $p = o(1)$, there are graphs in $\mathcal{G}(H, n, p n^2, p, \varepsilon)$ with no canonical copy of $H$.
Nevertheless, it still seemed likely that such atypical graphs comprise only a tiny proportion
of $\mathcal{G}(H, n, m, p, \varepsilon)$. This was formalized in the following conjecture of Kohayakawa, Luczak,
and Rodl [H], usually referred to as the KLR conjecture. Given a graph $H$, integers $m$
and $n$, a $p \in [0, 1]$, and a positive $\varepsilon$, let $\mathcal{G}^*(H, n, m, p, \varepsilon)$ denote the collection of graphs in
$\mathcal{G}(H, n, m, p, \varepsilon)$ that contain no canonical copy of $H$.

The KLR Conjecture. Let $H$ be a fixed graph. Then, for any positive $\beta$, there exist positive
$C$, $n_0$, and $\varepsilon$ such that for all $n$ and $m$ with $n \ge n_0$ and $m \ge C n^{2-1/m_2(H)}$,

$$\big|\mathcal{G}^*(H, n, m, m/n^2, \varepsilon)\big| \le \beta^m \binom{n^2}{m}^{e(H)}.$$
The KLR conjecture has been one of the central open questions in extremal graph theory
and has attracted substantial attention from many researchers over the past fifteen years. It
has been verified in several special cases. It is easy to see that it holds for all graphs $H$ which
do not contain a cycle. The cases $H = K_3$, $K_4$, and $K_5$ were resolved in [30], [28], and [29],
respectively. The case $H = C_\ell$ has also been resolved, but here the history is somewhat more
complex. A proof under some extra technical assumptions was given in [39]. Those extra 
assumptions were later removed in [27] and, independently, in We remark here that 
in parallel to this work, Conlon, Gowers, Samotij, and Schacht [13] have proved a sparse
analogue of the counting lemma for subgraphs of the random graph $G(n, p)$, which may be
viewed as a version of the KLR conjecture that is stronger in some aspects and weaker in
other aspects. Our next result is a proof of the KLR conjecture for all 2-balanced graphs.

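Theorem 1.8. Let $H$ be a 2-balanced graph. Then, for any positive $\beta$, there exist positive
$C$, $n_0$, and $\varepsilon$ such that for all $n$ and $m$ with $n \ge n_0$ and $m \ge C n^{2-1/m_2(H)}$,

$$\big|\mathcal{G}^*(H, n, m, m/n^2, \varepsilon)\big| \le \beta^m \binom{n^2}{m}^{e(H)}.$$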

It is well-known that Theorem 1.8 easily implies Turan's theorem for random graphs,
Theorem 1.3, and also its stability version, Theorem 1.4. In fact, this was the original
motivation behind the KLR conjecture, see [H]. Moreover, it was proved by Luczak [45]
that Theorem 1.8 implies Theorems 1.5 and 1.6. The work of Conlon and Gowers [12] and
Schacht [58] (see also [55]), as well as this work, have shown that one does not need to
appeal to the sparse regularity lemma and to the KLR conjecture in order to prove such
extremal statements in random graphs. Nevertheless, there is still a plenitude of beautiful
corollaries of the conjecture that cannot (yet) be proved by other means. For discussion and
derivation of some of them, we refer the reader to [13]. Here, we present only one corollary
of the KLR conjecture, the threshold for asymmetric Ramsey properties of random graphs,
which does not follow from the version of the conjecture proved in [13]. The deduction of
this result from the KLR conjecture is essentially due to Kohayakawa and Kreuter [39]. We
prove Theorem 1.8 in Section 9.

1.5. Ramsey properties of random graphs. Let $H$ be a fixed graph and let $r$ be a
positive integer. For an arbitrary graph $G$, we write $G \to (H)_r$ if every $r$-coloring of the
edges of $G$ contains a monochromatic copy of $H$. It follows from the classical result of
Ramsey [50] that $K_n \to (H)_r$, provided that $n$ is sufficiently large. Ramsey properties of
random graphs were first investigated by Frankl and Rodl [20] and since then much effort
has been devoted to their study. Most notably, Rodl and Rucinski [51, 52] established the
following general threshold result.

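Theorem 1.9. Let $r$ be a positive integer and let $H$ be an arbitrary fixed graph that is not
a forest. There exist positive constants $c$ and $C$ such that

$$\lim_{n \to \infty} \mathbb{P}\big(G(n, p_n) \to (H)_r\big) =
\begin{cases}
1 & \text{if } p_n \ge C n^{-1/m_2(H)}, \\
0 & \text{if } p_n \le c n^{-1/m_2(H)}.
\end{cases}$$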



In the above discussion, a copy of the same graph $H$ is forbidden in each of the $r$ color
classes. A natural generalization of Theorem 1.9 would determine thresholds for so-called
asymmetric Ramsey properties. For any graphs $G, H_1, \ldots, H_r$, we write $G \to (H_1, \ldots, H_r)$
if for every coloring of the edges of $G$ with colors $1, \ldots, r$, there exists, for some $i \in [r]$, a
copy of $H_i$ all of whose edges have color $i$. In the context of asymmetric Ramsey properties
of random graphs, the following generalization of the 2-density $m_2(\cdot)$ was introduced in [39].
For two graphs $H_1$ and $H_2$, define

$$m_2(H_1, H_2) = \max\left\{ \frac{e(H_1')}{v(H_1') - 2 + 1/m_2(H_2)} : H_1' \subseteq H_1 \text{ with } v(H_1') \ge 3 \right\}. \qquad (5)$$

Kohayakawa and Kreuter [39] formulated the following conjecture and proved it in the case
when all $H_i$ are cycles.

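Conjecture 1.10. Let $H_1, \ldots, H_r$ be graphs with $1 < m_2(H_r) \le \ldots \le m_2(H_1)$. Then there
exist constants $c$ and $C$ such that

$$\lim_{n \to \infty} \mathbb{P}\big(G(n, p_n) \to (H_1, \ldots, H_r)\big) =
\begin{cases}
1 & \text{if } p_n \ge C n^{-1/m_2(H_1, H_2)}, \\
0 & \text{if } p_n \le c n^{-1/m_2(H_1, H_2)}.
\end{cases}$$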

More accurately, the above conjecture was stated in [39] only in the case $r = 2$, but the
above generalization is quite natural, see ^B]. There had been little progress on
Conjecture 1.10 until quite recently, when the 0-statement was proved by Marciniszyn, Skokan,
Spohel, and Steger ^B] in the case where all of the $H_i$ are cliques, and the 1-statement in
the case $r = 2$ was established by Kohayakawa, Schacht, and Spohel [S] under very mild
extra assumptions on $H_1$ and $H_2$. Using Theorem 1.8, the approach of Kohayakawa and
Kreuter [39], which employs the sparse regularity lemma, can be adapted (see [16| Theorem 31])
to yield a proof of the 1-statement in Conjecture 1.10 for the following class of
graphs.

Theorem 1.11. Let $H_1, \ldots, H_r$ be graphs such that $H_1$ is strictly 2-balanced, $H_2, \ldots, H_r$
are 2-balanced, and $1 < m_2(H_r) \le \ldots \le m_2(H_1)$. There exists a constant $C$ such that if
$p_n \ge C n^{-1/m_2(H_1, H_2)}$, then a.a.s.

$$G(n, p_n) \to (H_1, \ldots, H_r).$$

For the deduction of Theorem 1.11 from Theorem 1.8, see [39] and [Ml Section 4].

1.6. Outline of the paper. The remainder of this paper is organized as follows. In
Section 2, we state and discuss our main result, Theorem 2.2, which we then prove in Section 3.
In Section 4, we discuss the applications of Theorem 2.2 in the context of subsets of $[n]$ with
no $k$-term arithmetic progressions. In particular, we prove Theorem 1.1 and use it to derive
Corollary 1.2. In Section 5, we prove two versions of the general transference theorem of
Schacht [58, Theorem 3.3] (obtained independently, in a slightly different form, by Conlon
and Gowers [12]): a 'random' version suited for extremal problems in sparse random
discrete structures and its 'counting' counterpart that generalizes Theorem 1.1. In Section 6,
we prove 'random' and 'counting' versions of the general stability result of Conlon and
Gowers [12] in a form that is easily comparable with [55, Theorem 3.4]. In Section 7, we discuss
several applications of Theorem 2.2 in the context of the Turan problem in sparse random
graphs. In particular, using the results of Sections 5 and 6, we give new proofs of the sparse
random analogues (stated above) of the classical theorems of Erdos and Stone, and Erdos
and Simonovits, see Section 1.2. In Section 8, we discuss applications of Theorem 2.2 to the
problem of describing the typical structure of a sparse graph without a forbidden subgraph.
In particular, we prove sparse analogues of classical theorems of Erdos, Frankl, and Rodl and
Erdos, Kleitman, and Rothschild, see Section 1.3. Finally, in Section 9, we use Theorem 2.2
to prove the KLR conjecture for all 2-balanced graphs.

2. The Main Theorem 

In this section, we present the main result of this paper, Theorem 2.2, which gives a
structural characterization of the collection of all independent sets in a large class of uniform
hypergraphs. We start with an important definition. Recall that a family of sets $\mathcal{F} \subseteq \mathcal{P}(V)$
is called increasing (or an upset) if it is closed under taking supersets, that is, if for every
$A, B \subseteq V$, $A \in \mathcal{F}$ and $A \subseteq B$ imply that $B \in \mathcal{F}$.

Definition 2.1. Let $\mathcal{H}$ be a uniform hypergraph with vertex set $V$, let $\mathcal{F}$ be an increasing
family of subsets of $V$ and let $\varepsilon \in (0, 1]$. We say that $\mathcal{H}$ is $(\mathcal{F}, \varepsilon)$-dense if

$$e(\mathcal{H}[A]) \ge \varepsilon\, e(\mathcal{H})$$

for every $A \in \mathcal{F}$.

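A moment of thought reveals that for an arbitrary hypergraph $\mathcal{H}$ and $\varepsilon \in (0, 1]$, it is
extremely simple to construct families $\mathcal{F} \subseteq \mathcal{P}(V(\mathcal{H}))$ for which $\mathcal{H}$ is $(\mathcal{F}, \varepsilon)$-dense. To this
end, let

$$\mathcal{F}_\varepsilon = \big\{A \subseteq V(\mathcal{H}) : e(\mathcal{H}[A]) \ge \varepsilon\, e(\mathcal{H})\big\}$$

and note that $\mathcal{F}_\varepsilon$ is increasing and $\mathcal{H}$ is $(\mathcal{F}_\varepsilon, \varepsilon)$-dense. In fact, the families $\mathcal{F}$ for which $\mathcal{H}$ is
$(\mathcal{F}, \varepsilon)$-dense are precisely all increasing subfamilies of $\mathcal{F}_\varepsilon$.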

In this work, we will be interested in upsets that admit a much more 'constructive' and
simpler description than that of $\mathcal{F}_\varepsilon$. Many such families arise naturally in the study of
extremal and structural problems in combinatorics. For example, consider the $k$-uniform
hypergraph $\mathcal{H}_1$ on the vertex set $[n]$ whose edges are all $k$-term arithmetic progressions in
$[n]$ and let $\mathcal{F}_1$ be the collection of all subsets of $[n]$ with at least $\delta n$ elements. Clearly, $\mathcal{F}_1$ is an
upset and it follows from the famous theorem of Szemerédi [62] that $\mathcal{H}_1$ is $(\mathcal{F}_1, \varepsilon)$-dense for
some positive $\varepsilon$ depending only on $\delta$ and $k$, see Section 4. Similarly, consider the 3-uniform
hypergraph $\mathcal{H}_2$ on the vertex set $E(K_n)$ whose edges are edge sets of all copies of $K_3$ in
the complete graph $K_n$ and let $\mathcal{F}_2$ be the family of all $n$-vertex graphs (subgraphs of $K_n$)
with at least $(1/2 - \varepsilon)\binom{n}{2}$ edges such that every 2-coloring of its vertices yields at least $\delta n^2$
monochromatic edges. Again, $\mathcal{F}_2$ is increasing and it follows from the stability theorem of
Erdos and Simonovits O EOj and the triangle removal lemma of Ruzsa and Szemeredi [SI] 
that $\mathcal{H}_2$ is $(\mathcal{F}_2, \varepsilon)$-dense, provided that $\varepsilon$ is sufficiently small as a function of $\delta$.

Our main result roughly says the following. If $\mathcal{H}$ is a uniform hypergraph that is $(\mathcal{F}, \varepsilon)$-dense
for some family $\mathcal{F}$ and whose edge distribution satisfies certain natural boundedness
conditions, then the collection $\mathcal{I}(\mathcal{H})$ of all independent sets in $\mathcal{H}$ admits a partition into
relatively few classes such that all independent sets in one class are essentially contained
in a single set $A \notin \mathcal{F}$. Before we state the result, we first need to quantify the above
boundedness condition for the edge distribution of a hypergraph. Given a hypergraph $\mathcal{H}$,
for each $T \subseteq V(\mathcal{H})$, we define

$$\deg_{\mathcal{H}}(T) = |\{e \in \mathcal{H} : T \subseteq e\}|,$$

and let

$$\Delta_\ell(\mathcal{H}) = \max\{\deg_{\mathcal{H}}(T) : T \subseteq V(\mathcal{H}) \text{ and } |T| = \ell\}.$$
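
To make the quantities $\deg_{\mathcal{H}}(T)$ and $\Delta_\ell(\mathcal{H})$ concrete, here is a minimal brute-force sketch for the hypergraph of $k$-term arithmetic progressions in $[n]$. The helper names `ap_hypergraph` and `max_degree` are hypothetical, and progressions are taken as sets with positive common difference, which is an assumption of this sketch.

```python
from itertools import combinations

def ap_hypergraph(n, k):
    """Edges: all k-term arithmetic progressions in {1, ..., n} (as sets)."""
    edges = []
    for a in range(1, n + 1):
        d = 1
        while a + (k - 1) * d <= n:
            edges.append(frozenset(a + t * d for t in range(k)))
            d += 1
    return edges

def max_degree(edges, vertices, ell):
    """Delta_ell(H): the largest number of edges containing a common ell-subset."""
    best = 0
    for T in combinations(vertices, ell):
        T = frozenset(T)
        best = max(best, sum(1 for e in edges if T <= e))
    return best

edges = ap_hypergraph(9, 3)
print(len(edges))                           # number of 3-term APs in [9]
print(max_degree(edges, range(1, 10), 1))   # Delta_1
print(max_degree(edges, range(1, 10), 2))   # Delta_2
```

For $n = 9$ and $k = 3$ the middle vertex 5 lies in the most progressions, and every pair lies in at most $\binom{3}{2} = 3$ of them, in line with the bound on $\Delta_\ell$ used in Section 4.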

Recall that $\mathcal{I}(\mathcal{H})$ denotes the family of all independent sets in $\mathcal{H}$. The following theorem is
our main result.

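Theorem 2.2. For every $k \in \mathbb{N}$ and all positive $c$, $c'$ and $\varepsilon$, there exists a positive constant
$C$ such that the following holds. Let $\mathcal{H}$ be a $k$-uniform hypergraph and let $\mathcal{F} \subseteq \mathcal{P}(V(\mathcal{H}))$
be an increasing family of sets such that $|A| \ge \varepsilon v(\mathcal{H})$ for all $A \in \mathcal{F}$. Suppose that $\mathcal{H}$ is
$(\mathcal{F}, \varepsilon)$-dense and $p \in (0, 1)$ is such that $p^{k-1} e(\mathcal{H}) \ge c' v(\mathcal{H})$ and for every $\ell \in [k-1]$,

$$\Delta_\ell(\mathcal{H}) \le c \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}.$$

Then there exists a family $\mathcal{S} \subseteq \binom{V(\mathcal{H})}{\le Cp\,v(\mathcal{H})}$ and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}) \to \mathcal{S}$ such
that for every $I \in \mathcal{I}(\mathcal{H})$,

$$g(I) \subseteq I \quad\text{and}\quad I \setminus g(I) \subseteq f(g(I)).$$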

Roughly speaking, if $\mathcal{H}$ satisfies certain technical conditions, then each independent set
$I$ in $\mathcal{H}$ can be labeled with a small subset $g(I)$ in such a way that all sets labeled with
some $S \in \mathcal{S}$ are essentially contained in a single set $f(S)$ that contains very few edges of
$\mathcal{H}$. We remark that the constant $C$ in the theorem has only a polynomial dependence on $\varepsilon$.
Unfortunately, however, in most of our applications $\varepsilon$ will have a tower-type dependence on
some other parameter.

Theorem 2.2 will be proved in Section 3. We end this section with a short informal
discussion of its consequences. As we have already mentioned, Theorem 2.2 combined with
some classical extremal results on discrete structures has strikingly strong implications. Let
us briefly explain why this is so. Many classical extremal problems ask for an estimate
on the number of independent sets (of certain size) in some auxiliary uniform hypergraph.
If applicable, Theorem 2.2 implies that all such independent sets are almost contained in
one of very few sets that are almost independent, that is, contain a small number of copies
of some forbidden substructure. If we know a good characterization of sets that are almost
independent in the above sense, which is often the case, we can easily obtain an upper bound
on the number of independent sets. For example, consider the problem of counting subsets
of $[n]$ with no $k$-term AP and recall the definition of $\mathcal{H}_1$ and $\mathcal{F}_1$ from the beginning of this
section. Theorem 2.2, applied to this pair, implies that every subset of $[n]$ with no $k$-term
AP is essentially contained in one of at most $\binom{n}{O(n^{1-1/(k-1)})}$ sets of size at most $\delta n$ each, where
$\delta$ is an arbitrarily small positive constant. This easily implies that if $m \gg n^{1-1/(k-1)}$, then
there are at most $\binom{2\delta n}{m}$ sets of size $m$ with no $k$-term AP. For details, we refer the reader to
Section 4.

3. Proof of the main theorem

In this section, we shall prove Theorem 2.2. The main ingredient in the proof is the
following proposition, which (roughly) says that Theorem 2.2 holds in the special case when
$\mathcal{F}$ is the family of all subsets of $V(\mathcal{H})$ with at least $(1-\delta)v(\mathcal{H})$ elements. Theorem 2.2
follows by applying Proposition 3.1 a constant number of times.

Proposition 3.1. For every integer $k$ and positive $c$ and $c'$, there exists a positive $\delta$ such
that the following holds. Let $p \in (0, 1)$ and suppose that $\mathcal{H}$ is a $k$-uniform hypergraph such
that $p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$ and for every $\ell \in [k-1]$,

$$\Delta_\ell(\mathcal{H}) \le c \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}.$$

Then there exist a family $\mathcal{S} \subseteq \binom{V(\mathcal{H})}{\le (k-1)p\,v(\mathcal{H})}$ and functions $f_0\colon \mathcal{S} \to \mathcal{P}(V(\mathcal{H}))$ and
$g_0\colon \mathcal{I}(\mathcal{H}) \to \mathcal{S}$ such that for every $I \in \mathcal{I}(\mathcal{H})$,

$$g_0(I) \subseteq I \subseteq f_0(g_0(I)) \cup g_0(I) \quad\text{and}\quad |f_0(g_0(I))| \le (1-\delta)v(\mathcal{H}).$$

Moreover, if for some $I, I' \in \mathcal{I}(\mathcal{H})$, $g_0(I) \subseteq I'$ and $g_0(I') \subseteq I$, then $g_0(I) = g_0(I')$.

The final line of Proposition 3.1 states that the labeling function $g_0$ exhibits a certain
consistency. This property of $g_0$, which may look somewhat puzzling, will be crucial in the
proof of Theorem 2.2.

In order to prove Proposition 3.1, given an independent set $I \in \mathcal{I}(\mathcal{H})$, we shall construct
a sequence $(B_{k-1}, \ldots, B_q)$ of subsets of $I$ with $|B_{k-1}|, \ldots, |B_q| \le pv(\mathcal{H})$, for some $q \in [k-1]$,
and use it to define a sequence $(\mathcal{H}_{k-1}, \ldots, \mathcal{H}_r)$, where $r \in \{q, q+1\}$, of hypergraphs such
that the following holds for each $i \in \{r, \ldots, k-1\}$:

(a) $\mathcal{H}_i$ is an $i$-uniform hypergraph on the vertex set $V(\mathcal{H})$,

(b) $I$ is an independent set in $\mathcal{H}_i$, and

(c) $e(\mathcal{H}_i) \ge \Omega(p^{k-i}e(\mathcal{H}))$.

We shall be able to do it in such a way that in the end, there will be a set $A \subseteq V(\mathcal{H})$ of size
at most $(1-\delta)v(\mathcal{H})$ such that the remaining elements of $I$, the set $I \setminus S$, must all lie inside
$A$. If $r = 1$, then we will simply let $A$ be the set of non-edges of the 1-uniform hypergraph
$\mathcal{H}_1$; in this case, the upper bound on $|A|$ will follow from (c) and our assumption that
$p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$. If $r > 1$, then we will obtain an appropriate $A$ while trying (and failing)
to construct the hypergraph $\mathcal{H}_{r-1}$ using the hypergraph $\mathcal{H}_r$ and the set $B_r$. Crucially, this
set $A$ will depend solely on $S$, that is, if for some pair $I, I' \in \mathcal{I}(\mathcal{H})$ our procedure generates
$(S, A)$ and $(S', A')$, respectively, and $S = S'$, then also $A = A'$. This will allow us to set
$g_0(I) = S$ and $f_0(S) = A$.

3.1. The Algorithm Method. For the remainder of this section, let us fix $k$, $c$, $c'$, $p$ and
$\mathcal{H}$ as in the statement of Proposition 3.1. Without loss of generality, we may assume that
$c \ge 1$. Let $I$ be an independent set in $\mathcal{H}$. We shall describe a procedure of choosing the sets
$B_i \subseteq I$ and constructing the hypergraphs $\mathcal{H}_i$ as above. This procedure, which we shall term
the Scythe Algorithm, lies at the heart of the proof of Proposition 3.1.

The general strategy used in the Scythe Algorithm, that of selecting a small set $S$ of
high-degree vertices and using it to define a set $A$ such that $S \subseteq I \subseteq A \cup S$, dates back to
the work of Kleitman and Winston [27], who used it to bound the number of independent
sets in graphs satisfying the following local density condition: all sufficiently large vertex
sets induce subgraphs with many edges. Recently, Balogh and Samotij P, ITU] refined the 
ideas of Kleitman and Winston and obtained a bound on the number of independent sets in 
uniform hypergraphs satisfying a similar local density condition. Even more recently, Alon, 
Balogh, Morris and Samotij [T] used similar ideas to bound the number of independent sets 
in 'almost linear' 3-uniform hypergraphs satisfying a more general density condition termed 
$(\alpha, \beta)$-stability, see Definition 6.1. Here, we combine, generalize, and refine all of the above
approaches and make them work in the general setting of $(\mathcal{F}, \varepsilon)$-dense uniform hypergraphs.

At each step of the Scythe Algorithm, we shall order the vertices of a certain subhypergraph
of $\mathcal{H}$ with respect to their degrees in that subhypergraph. For the sake of brevity and clarity
of the presentation, let us make the following definition.

Definition 3.2 (Max-degree order). Given a hypergraph $\mathcal{G}$, we define the max-degree order
on $V(\mathcal{G})$ as follows:

(1) Fix an arbitrary total ordering of $V(\mathcal{G})$.

(2) For each $j \in \{1, \ldots, v(\mathcal{G})\}$, let $u_j$ be the maximum-degree vertex in the hypergraph
$\mathcal{G}[V(\mathcal{G}) \setminus \{u_1, \ldots, u_{j-1}\}]$; ties are broken by giving preference to vertices which come
earlier in the order chosen in (1).

(3) The max-degree order on $V(\mathcal{G})$ is $(u_1, \ldots, u_{v(\mathcal{G})})$.

Finally, we write $W(u)$ to denote the initial segment of the max-degree order on $V(\mathcal{G})$ that
ends with $u$, i.e., for every $j$, we let $W(u_j) = \{u_1, \ldots, u_j\}$.

We remark here that the only property of the max-degree order that will be important
for us is that for every $j \in \{1, \ldots, v(\mathcal{G})\}$, the degree of the vertex $u_j$ in the hypergraph
$\mathcal{G}[V(\mathcal{G}) \setminus W(u_{j-1})]$ is at least as large as the average degree of this hypergraph.
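
The max-degree order of Definition 3.2 can be computed greedily. The following is a minimal sketch, assuming that the hypergraph is given as a list of vertices (whose order plays the role of the arbitrary total ordering in step (1)) and a collection of edges contained in that vertex set; the function name `max_degree_order` is hypothetical.

```python
def max_degree_order(vertices, edges):
    """Return the max-degree order (u_1, ..., u_v) of Definition 3.2.

    `vertices` is a list; its order is used to break ties.
    `edges` is an iterable of vertex collections contained in `vertices`.
    """
    remaining = list(vertices)
    live = [frozenset(e) for e in edges]
    order = []
    while remaining:
        # Degrees in the subhypergraph induced by the remaining vertices.
        deg = {v: 0 for v in remaining}
        for e in live:
            for v in e:
                deg[v] += 1
        # max() returns the first maximiser, so earlier vertices win ties.
        u = max(remaining, key=lambda v: deg[v])
        order.append(u)
        remaining.remove(u)
        # Only edges avoiding u lie inside the new remaining set.
        live = [e for e in live if u not in e]
    return order

# Path 0 - 1 - 2 - 3: vertices 1 and 2 tie for the top degree, and 1 comes first.
print(max_degree_order([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]))
```

Recomputing all degrees in every round is quadratic and is chosen here purely for readability.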
We next define the numbers $\Delta_\ell^i$, where $1 \le \ell < i \le k$, which will play a crucial role in the
description and the analysis of the algorithm.

Definition 3.3. For every $\ell \in [k-1]$, let $\Delta_\ell^k = \Delta_\ell(\mathcal{H})$ and for all $i \in [k-1]$ and $\ell \in [i-1]$,
let

$$\Delta_\ell^i = \max\{2 \cdot \Delta_{\ell+1}^{i+1},\; p \cdot \Delta_\ell^{i+1}\}.$$ (6)
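
The recursion (6) is easy to tabulate. The sketch below assumes that the base values $\Delta_\ell(\mathcal{H})$ are supplied in a dictionary; the function name `delta_table` and the sample numbers are hypothetical and serve only to show how the two terms of the maximum compete.

```python
def delta_table(k, p, base):
    """Tabulate Delta_ell^i from recursion (6).

    base[ell] = Delta_ell(H) for ell = 1, ..., k-1.
    Returns a dict D with D[(i, ell)] = Delta_ell^i for 1 <= ell < i <= k.
    """
    D = {(k, ell): base[ell] for ell in range(1, k)}
    for i in range(k - 1, 0, -1):
        for ell in range(1, i):
            D[(i, ell)] = max(2 * D[(i + 1, ell + 1)], p * D[(i + 1, ell)])
    return D

# Sample values for k = 3: Delta_1(H) = 8, Delta_2(H) = 3, p = 0.5.
D = delta_table(3, 0.5, {1: 8, 2: 3})
print(D[(2, 1)])  # max(2 * 3, 0.5 * 8)
```

In this sample the first term of the maximum wins, so $\Delta_1^2 = 2\Delta_2(\mathcal{H})$.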
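We use the numbers $\Delta_\ell^i$ to define the following families of sets with high degree.

Definition 3.4. Given an $i \in [k]$, an $i$-uniform hypergraph $\mathcal{G}$ and an $\ell \in [i-1]$, let

$$M_\ell^i(\mathcal{G}) = \left\{ T \in \binom{V(\mathcal{G})}{\ell} : \deg_{\mathcal{G}}(T) \ge \frac{\Delta_\ell^i}{2} \right\}.$$

Moreover, let $M_i^i(\mathcal{G}) = E(\mathcal{G})$.

Let $b = pv(\mathcal{H})$ and for each $i \in [k]$, let $c_i = (ck2^{k+1})^{i-k}$.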

Properties. The key properties that we would like the constructed hypergraph $\mathcal{H}_i$ to possess
are:

(P1) $\mathcal{H}_i$ is $i$-uniform and $V(\mathcal{H}_i) = V(\mathcal{H})$,
(P2) $I$ is an independent set in $\mathcal{H}_i$,
(P3) $\Delta_\ell(\mathcal{H}_i) \le \Delta_\ell^i$ for each $\ell \in [i-1]$,
(P4) $e(\mathcal{H}_i) \ge c_i p^{k-i} e(\mathcal{H})$.

Set $\mathcal{H}_k = \mathcal{H}$ and note that (P1)-(P4) are vacuously satisfied for $i = k$. The main step
of the Scythe Algorithm will be a procedure that, given $\mathcal{H}_{i+1}$ and $I$ satisfying (P1)-(P4),
outputs a set $B_i \subseteq I$ of cardinality $b$, a set $A_i \subseteq V(\mathcal{H})$ with the property that $I \setminus B_i \subseteq A_i$,
and a hypergraph $\mathcal{H}_i$ satisfying (P1)-(P3). Moreover, if the constructed $\mathcal{H}_i$ does not satisfy
(P4), then we have $|A_i| \le (1-c_i)v(\mathcal{H})$. Crucially, these $A_i$ and $\mathcal{H}_i$ depend solely on $B_i$ and
$\mathcal{H}_{i+1}$, that is, if on two inputs $(\mathcal{H}_{i+1}, I)$ and $(\mathcal{H}_{i+1}, I')$, the procedure outputs the same set
$B_i$, it also outputs the same $A_i$ and $\mathcal{H}_i$.

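The Scythe Algorithm. Given an $(i+1)$-uniform hypergraph $\mathcal{H}_{i+1}$ and an independent
set $I \in \mathcal{I}(\mathcal{H}_{i+1})$, set $\mathcal{A}_{i+1}^{(0)} = \mathcal{H}_{i+1}$ and let $\mathcal{H}_i^{(0)}$ be the empty hypergraph on the vertex set
$V(\mathcal{H})$. For $j = 0, \ldots, b-1$, do the following:

(1) If $I \cap V(\mathcal{A}_{i+1}^{(j)}) = \emptyset$, then set $\mathcal{H}_i = \mathcal{H}_i^{(0)}$, $A_i = \emptyset$ and $B_i = \{u_0, \ldots, u_{j-1}\}$ and STOP.

(2) Let $u_j$ be the first vertex of $I$ in the max-degree order on $V(\mathcal{A}_{i+1}^{(j)})$.

(3) Let $\mathcal{H}_i^{(j+1)}$ be the hypergraph on the vertex set $V(\mathcal{H})$ defined by:

$$\mathcal{H}_i^{(j+1)} = \mathcal{H}_i^{(j)} \cup \left\{ D \in \binom{V(\mathcal{H})}{i} : D \cup \{u_j\} \in \mathcal{A}_{i+1}^{(j)} \right\}.$$

(4) Let $\mathcal{A}_{i+1}^{(j+1)}$ be the hypergraph on the vertex set $V(\mathcal{A}_{i+1}^{(j)}) \setminus W(u_j)$ defined by:

$$\mathcal{A}_{i+1}^{(j+1)} = \left\{ D \in \mathcal{A}_{i+1}^{(j)} : D \cap W(u_j) = \emptyset \text{ and } T \not\subseteq D \text{ for every } T \in \bigcup_{\ell=1}^{i} M_\ell^i(\mathcal{H}_i^{(j+1)}) \right\}.$$

Finally, set $\mathcal{H}_i = \mathcal{H}_i^{(b)}$, $A_i = V(\mathcal{A}_{i+1}^{(b)})$, and $B_i = \{u_0, \ldots, u_{b-1}\}$.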
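
One round of the procedure above can be sketched in code. This is only an illustrative reading of steps (1)-(4), under several assumptions: vertices are integers, $b$ is passed as an integer, the thresholds $\Delta_\ell^i$ are supplied in a dictionary `Delta`, and the names `scythe_round` and `max_degree_order` are hypothetical helpers rather than anything defined in the paper.

```python
from itertools import combinations

def max_degree_order(vertices, edges):
    """Max-degree order of Definition 3.2; ties go to earlier vertices."""
    remaining, live, order = list(vertices), [frozenset(e) for e in edges], []
    while remaining:
        deg = {v: 0 for v in remaining}
        for e in live:
            for v in e:
                deg[v] += 1
        u = max(remaining, key=lambda v: deg[v])
        order.append(u)
        remaining.remove(u)
        live = [e for e in live if u not in e]
    return order

def scythe_round(V, H_next, I, i, b, Delta):
    """From the (i+1)-uniform H_next and an independent set I, return (H_i, A_i, B_i).

    Delta[ell] is assumed to hold Delta_ell^i for ell = 1, ..., i-1.
    """
    A_vertices = list(V)                         # vertex set of A^{(j)}
    A_edges = {frozenset(e) for e in H_next}     # edges of A^{(j)}
    H, B, I = set(), [], set(I)
    for _ in range(b):
        if not I & set(A_vertices):              # step (1): nothing of I is left
            return set(), set(), B
        order = max_degree_order(A_vertices, A_edges)
        pos = next(t for t, v in enumerate(order) if v in I)
        u, W = order[pos], set(order[:pos + 1])  # step (2): u_j and W(u_j)
        B.append(u)
        H |= {e - {u} for e in A_edges if u in e}            # step (3)
        # Sets of high degree in the current H_i, plus its edges (ell = i).
        M = set(H)
        for ell in range(1, i):
            deg = {}
            for e in H:
                for T in combinations(sorted(e), ell):
                    deg[frozenset(T)] = deg.get(frozenset(T), 0) + 1
            M |= {T for T, d in deg.items() if d >= Delta[ell] / 2}
        A_vertices = [v for v in A_vertices if v not in W]   # step (4)
        A_edges = {e for e in A_edges
                   if not (e & W) and not any(T <= e for T in M)}
    return H, set(A_vertices), B

# Toy run with i = 1: a graph on {0,...,4} and the independent set {3, 4}.
V = [0, 1, 2, 3, 4]
graph = [(0, 1), (0, 2), (0, 3), (1, 2)]
H1, A1, B1 = scythe_round(V, graph, {3, 4}, i=1, b=1, Delta={})
print(H1, A1, B1)
```

In the toy run the max-degree order is $(0, 1, 2, 3, 4)$, the first vertex of $I$ is 3, and the output satisfies $B_1 \subseteq I \subseteq A_1 \cup B_1$, as in Lemma 3.5(c).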
We shall now establish various properties of the Scythe Algorithm. We begin by making
some basic (but key) observations.

Lemma 3.5. The following hold for every $i \in [k-1]$:
(a) $\mathcal{H}_i$ is $i$-uniform and $V(\mathcal{H}_i) = V(\mathcal{H})$.

(b) If $I \in \mathcal{I}(\mathcal{H}_{i+1})$, then $I \in \mathcal{I}(\mathcal{H}_i)$.

(c) $B_i \subseteq I \subseteq A_i \cup B_i$.

(d) The hypergraph $\mathcal{H}_i$ and the set $A_i$ depend only on $\mathcal{H}_{i+1}$ and the set $B_i$.

Proof. Property (a) is trivial. To see (b), simply observe that each edge of $\mathcal{H}_i$ is of the form
$D \setminus \{u\}$ for some $D \in \mathcal{H}_{i+1}$ and $u \in I$. Thus, if $I$ contains an edge of $\mathcal{H}_i$, it must also
contain an edge of $\mathcal{H}_{i+1}$. To see (c), observe that for each $j$, $u_j$ is the first vertex of $I$ in
the max-degree order on $V(\mathcal{A}_{i+1}^{(j)})$ and hence $W(u_j) \cap I = \{u_j\}$. It follows that $B_i \subseteq I$
and that $I \setminus A_i = B_i$. Note in particular that if $A_i = \emptyset$, then $I \cap V(\mathcal{A}_{i+1}^{(j)}) = \emptyset$ for some
$j \in \{0, \ldots, b\}$, which implies that $B_i = I$. Finally, to prove (d), observe that all steps of
the Scythe Algorithm are deterministic and that every element of $I$ that we need to observe
in order to define $A_i$ and $\mathcal{H}_i$ is placed in $B_i$. More precisely, note that while choosing the
vertex $u_j$, we only need to know the first vertex of $I$ in the max-degree order on $V(\mathcal{A}_{i+1}^{(j)})$;
the remaining vertices remain unobserved. Since we have $W(u_j) \cap B_i = W(u_j) \cap I = \{u_j\}$,
this information can be recovered from $B_i$. Thus, at each step, the hypergraph $\mathcal{H}_i^{(j+1)}$ can be
recovered from $\mathcal{H}_i^{(j)}$ and $B_i$, and the hypergraph $\mathcal{A}_{i+1}^{(j+1)}$ can be recovered from $\mathcal{A}_{i+1}^{(j)}$, $\mathcal{H}_i^{(j+1)}$
and $B_i$. Hence, a trivial inductive argument proves that, if the algorithm does not stop in
step (1), for each $j \in \{0, \ldots, b\}$, the hypergraphs $\mathcal{H}_i^{(j)}$ and $\mathcal{A}_{i+1}^{(j)}$ are determined by $\mathcal{H}_{i+1}$ and
the set $B_i$, as required. Finally, the algorithm stops in step (1) if and only if $|B_i| < b$. If this
happens, then $\mathcal{H}_i$ and $A_i$ are empty. □

We next show that the Scythe Algorithm exhibits a certain 'consistency' while generating
its output. This property will be very important in the proof of Proposition 3.1.

Lemma 3.6. Suppose that on inputs $(\mathcal{H}_{i+1}, I)$ and $(\mathcal{H}_{i+1}, I')$, the Scythe Algorithm outputs
$(A_i, B_i, \mathcal{H}_i)$ and $(A_i', B_i', \mathcal{H}_i')$, respectively. If $B_i \subseteq I'$ and $B_i' \subseteq I$, then $(A_i, B_i, \mathcal{H}_i) =
(A_i', B_i', \mathcal{H}_i')$.

Proof. By Lemma 3.5, it suffices to show that $B_i = B_i'$. Suppose that $B_i \ne B_i'$. Let us first
consider the (degenerate) case when $\min\{|B_i|, |B_i'|\} < b$. Without loss of generality, we may
assume that $|B_i| < b$. This means that, while running on $(\mathcal{H}_{i+1}, I)$, the Scythe Algorithm
stopped in step (1). By Lemma 3.5, it follows that $B_i = I$ and hence $B_i' \subseteq B_i$, which means
that $|B_i'| < b$ and therefore $B_i' = I'$. Hence, $B_i = B_i'$, as claimed. On the other hand, if
$|B_i| = |B_i'| = b$, then there must exist some $j$ such that $u_j \ne u_j'$. Let $j$ be the smallest such
index. Note that by the minimality of $j$, we have $\mathcal{A}_{i+1}^{(j)} = (\mathcal{A}_{i+1}^{(j)})' = \mathcal{A}$. Since $u_j \ne u_j'$, one of
these vertices comes earlier in the max-degree order on $V(\mathcal{A})$; without loss of generality, we
may suppose that it is $u_j$. Since $B_i \subseteq I'$, it follows that $u_j \in I'$ and hence the Algorithm,
while running on the input $(\mathcal{H}_{i+1}, I')$, would not pick $u_j'$ in step $j$, a contradiction. This
shows that in fact $B_i = B_i'$, as required. □

The next lemma motivates the definition of $M_i^i(\mathcal{G})$; it will be an important tool in the
proof of Lemma 3.10 below.

Lemma 3.7. For every $D \in \mathcal{H}_i$, there is a unique $j \in \{0, \ldots, b-1\}$ such that $D \cup \{u_j\} \in
\mathcal{A}_{i+1}^{(j)}$. In other words, no edge is added to $\mathcal{H}_i$ more than once.

Proof. If an edge $D$ is added to $\mathcal{H}_i$ in step $j$, i.e., if $D \cup \{u_j\} \in \mathcal{A}_{i+1}^{(j)}$, then $D \in M_i^i(\mathcal{H}_i^{(j+1)})$ and
consequently all edges containing $D$ are deleted from $\mathcal{A}_{i+1}^{(j)}$ in step (4). It follows that
$D \cup \{u_{j'}\} \notin \mathcal{A}_{i+1}^{(j')}$ for every $j' \in \{j+1, \ldots, b-1\}$. □

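The next lemma shows that if $\mathcal{H}_{i+1}$ satisfies (P3), then so does $\mathcal{H}_i$. The lemma follows
easily from the definitions of $\Delta_\ell^i$ and $M_\ell^i(\mathcal{G})$.

Lemma 3.8. If $\Delta_{\ell+1}(\mathcal{H}_{i+1}) \le \Delta_{\ell+1}^{i+1}$ for some $\ell \in [i-1]$, then $\Delta_\ell(\mathcal{H}_i) \le \Delta_\ell^i$.

Proof. The crucial observation is that if

$$\deg_{\mathcal{H}_i^{(j)}}(T) \ge \frac{\Delta_\ell^i}{2}$$

for some $\ell$-element set $T \subseteq V(\mathcal{H})$ and $j \in [b]$, then all edges containing $T$ are removed from
$\mathcal{A}_{i+1}^{(j)}$, and hence no more such edges are added to $\mathcal{H}_i$. It follows that $\deg_{\mathcal{H}_i}(T) = \deg_{\mathcal{H}_i^{(j)}}(T)$.
Moreover, when we extend $\mathcal{H}_i^{(j-1)}$ to $\mathcal{H}_i^{(j)}$, then we only add to it sets $D$ such that
$D \cup \{u_{j-1}\} \in \mathcal{A}_{i+1}^{(j-1)} \subseteq \mathcal{H}_{i+1}$ and hence

$$\deg_{\mathcal{H}_i^{(j)}}(T) - \deg_{\mathcal{H}_i^{(j-1)}}(T) \le \deg_{\mathcal{H}_{i+1}}(T \cup \{u_{j-1}\}) \le \Delta_{|T|+1}(\mathcal{H}_{i+1}).$$

It follows that

$$\Delta_\ell(\mathcal{H}_i) \le \frac{\Delta_\ell^i}{2} + \Delta_{\ell+1}(\mathcal{H}_{i+1}) \le \frac{\Delta_\ell^i}{2} + \Delta_{\ell+1}^{i+1} \le \Delta_\ell^i,$$

where the last inequality follows from (6). □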

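Next, let us establish some simple properties of the numbers $\Delta_\ell^i$.

Lemma 3.9. The following inequalities hold:
(a) $\Delta_i^{i+1} \le c2^k p^{-1}$ for every $i \in [k-1]$ and
(b) $\Delta_1^i \le c2^k p^{k-i} \frac{e(\mathcal{H})}{v(\mathcal{H})}$ for every $i \in \{2, \ldots, k\}$.

Proof. To prove the lemma, simply note that, by the definition of $\Delta_\ell^i$, for every $i \in [k]$ and
every $\ell \in [i-1]$,

$$\Delta_\ell^i = 2^d p^{k-i-d} \Delta_{\ell+d}(\mathcal{H}) \quad\text{for some } d \in \{0, \ldots, k-i\}.$$ (7)

One easily proves (7) by induction on $k - i$. Intuitively, $d$ in (7) is the number of times
that the first term in the maximum in (6) is larger than the second term when following the
recursive definition of $\Delta_\ell^i$ back to $\Delta_{\ell+d}^k$.

Since $\Delta_\ell(\mathcal{H}) \le c \cdot \min\{p^{\ell-k}, p^{\ell-1} e(\mathcal{H})/v(\mathcal{H})\}$, as in the statement of Proposition 3.1, it follows
from (7) that

$$\Delta_i^{i+1} \le \max_d \left\{ 2^d p^{k-i-1-d} \Delta_{d+i}(\mathcal{H}) \right\} \le \max_d \left\{ 2^d p^{k-i-1-d} \cdot c p^{i+d-k} \right\} \le c \cdot 2^k p^{-1}$$

and

$$\Delta_1^i \le \max_d \left\{ 2^d p^{k-i-d} \Delta_{d+1}(\mathcal{H}) \right\} \le \max_d \left\{ 2^d p^{k-i-d} \cdot c p^{d} \cdot \frac{e(\mathcal{H})}{v(\mathcal{H})} \right\} \le c \cdot 2^k p^{k-i} \frac{e(\mathcal{H})}{v(\mathcal{H})},$$

as required. □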

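Finally, we show that if $\mathcal{H}_{i+1}$ satisfies (P3) and (P4), then either $\mathcal{H}_i$ also satisfies (P4)
or we have $|A_i| \le (1-c_i)v(\mathcal{H})$. Recall that $c_i = (ck2^{k+1})^{i-k}$.

Lemma 3.10. Let $i \in [k-1]$ and suppose that $e(\mathcal{H}_{i+1}) \ge c_{i+1}p^{k-(i+1)}e(\mathcal{H})$ and that
$\Delta_\ell(\mathcal{H}_{i+1}) \le \Delta_\ell^{i+1}$ for every $\ell \in [i]$. Then either

$$e(\mathcal{H}_i) \ge c_i p^{k-i} e(\mathcal{H})$$ (8)

or $|A_i| \le (1-c_i)v(\mathcal{H})$.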



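Proof. If the Scythe Algorithm stops in step (1), then $|A_i| = 0$ and there is nothing to prove.
Hence, we may assume that steps (2)-(4) are executed $b$ times. Since no edge is added to
$\mathcal{H}_i$ more than once, see Lemma 3.7, then for each $j \in \{0, \ldots, b-1\}$,

$$e(\mathcal{H}_i^{(j+1)}) - e(\mathcal{H}_i^{(j)}) = \deg_{\mathcal{A}_{i+1}^{(j)}}(u_j).$$ (9)

By the definition of the max-degree order, the right-hand side of (9) is at least the average
degree of the hypergraph $\tilde{\mathcal{A}}_{i+1}^{(j)}$, the subhypergraph of $\mathcal{A}_{i+1}^{(j)}$ induced by the set
$(V(\mathcal{A}_{i+1}^{(j)}) \setminus W(u_j)) \cup \{u_j\}$. Therefore, by the definition of $\mathcal{A}_{i+1}^{(j+1)}$, we have

$$e(\mathcal{H}_i^{(j+1)}) - e(\mathcal{H}_i^{(j)}) \ge \frac{(i+1)\, e(\tilde{\mathcal{A}}_{i+1}^{(j)})}{v(\tilde{\mathcal{A}}_{i+1}^{(j)})} \ge \frac{(i+1)\, e(\mathcal{A}_{i+1}^{(j+1)})}{v(\mathcal{H})}.$$

Hence, if $(i+1)e(\mathcal{A}_{i+1}^{(j+1)}) \ge e(\mathcal{H}_{i+1})$ for every $j \in \{0, \ldots, b-1\}$, then

$$e(\mathcal{H}_i) = \sum_{j=0}^{b-1} \left( e(\mathcal{H}_i^{(j+1)}) - e(\mathcal{H}_i^{(j)}) \right) \ge b \cdot \frac{e(\mathcal{H}_{i+1})}{v(\mathcal{H})} = p\, e(\mathcal{H}_{i+1}) \ge c_{i+1} p^{k-i} e(\mathcal{H}) \ge c_i p^{k-i} e(\mathcal{H}),$$

as required. Thus, we may assume that for some $j$,

$$e(\mathcal{A}_{i+1}^{(b)}) \le e(\mathcal{A}_{i+1}^{(j+1)}) < \frac{e(\mathcal{H}_{i+1})}{i+1}.$$ (10)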

Intuitively, (10) means that while running the Scythe Algorithm on $\mathcal{H}_{i+1}$ and $I$, many edges
are removed from $\mathcal{A}_{i+1}$ (that is, $\mathcal{H}_{i+1}$) in step (4). This may happen for one of the following
two reasons: either many of the initial segments $W(u_j)$ are long or one of the families $M_\ell^i(\mathcal{H}_i)$
of sets with high degree in $\mathcal{H}_i$ is large.

Claim. Either 

6-1 

3=0 1 

or for some £ G [i]. 
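Proof of claim. Recall that $\mathcal{A}_{i+1}^{(0)} = \mathcal{H}_{i+1}$ and observe that for every $j \in \{0, \ldots, b-1\}$,

$$e(\mathcal{A}_{i+1}^{(j)}) - e(\mathcal{A}_{i+1}^{(j+1)}) \le |W(u_j)| \cdot \Delta_1(\mathcal{H}_{i+1}) + \sum_{\ell=1}^{i} \left| M_\ell^i(\mathcal{H}_i^{(j+1)}) \setminus M_\ell^i(\mathcal{H}_i^{(j)}) \right| \cdot \Delta_\ell(\mathcal{H}_{i+1}).$$ (11)

Inequality (11) follows since in step (4) of the Scythe Algorithm, we remove from $\mathcal{A}_{i+1}^{(j)}$ only
the edges that contain either a vertex of $W(u_j)$ or a member of $M_\ell^i(\mathcal{H}_i^{(j+1)})$ for some $\ell \in [i]$.
Thus, since $\Delta_\ell(\mathcal{H}_{i+1}) \le \Delta_\ell^{i+1}$ for every $\ell \in [i]$, summing (11) over all $j$, we get

$$e(\mathcal{H}_{i+1}) - e(\mathcal{A}_{i+1}^{(b)}) \le \sum_{j=0}^{b-1} |W(u_j)| \cdot \Delta_1^{i+1} + \sum_{\ell=1}^{i} \left| M_\ell^i(\mathcal{H}_i^{(b)}) \right| \cdot \Delta_\ell^{i+1}.$$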

Since we assumed that e(^-^\) < e{^ij^\) /(i + 1), see (fTO|), and Ui = nf\ it follows that if 

2(i + l) -^'-^+1)' 



i=o 
then 

|m;(-H,) I ■ A^+i ^ ^^-^ ■ eCH,+i) for some ^ G [2], 

as claimed. □ 

Finally, let us deal with the two cases implied by the claim. In the remainder of the proof,
we will show that if $M_\ell^i(\mathcal{H}_i)$ is large for some $\ell \in [i]$, then $e(\mathcal{H}_i)$ is large and if $\sum_{j=0}^{b-1} |W(u_j)|$
is large, then $|A_i|$ is small.

Case 1: \Ml{H^) \ ^ i^^^^^ ■ eiU^+i) for some £ G [z]. 

If £ < i, then deg^. (T) ^ A.\/2 for every T G M^(7ii), so by the handshaking lemma, 

^(^^) = J deg^.(^) ^ ^ ^Tiv • (12) 



Tg 



2n 



Recalling that A} ^ pAl'^^, see (jH]), we have 



eCHj+i) _Ai ^ p N ^ P 

4(^ + 1)0 



as required. On the other hand, if £ = then recalling that A-"*"^ ^ c2''p ^, see Lemma [3. 9 [ 
we have 

as required. 

Case 2: E -li \Wiu,)\ ^ ^ ■ e(?/.+i). 
We claim that in this case, $|A_i| \le (1-c_i)v(\mathcal{H})$. Indeed, we have

$$v(\mathcal{H}) - |A_i| = v(\mathcal{A}_{i+1}^{(0)}) - v(\mathcal{A}_{i+1}^{(b)}) = \sum_{j=0}^{b-1} |W(u_j)|.$$

Recall that $\Delta_1^{i+1} \le c2^k p^{k-i-1} \frac{e(\mathcal{H})}{v(\mathcal{H})}$ by Lemma 3.9. Thus



since e(7/i+i) ^ Ci+ip'=-(*+^)e(?{) and Ci+i/(c2^+2) ^ □ 
3.2. The proof of Proposition 3.1 and Theorem 2.2.

Proof of Proposition 3.1. Let $k$ be an integer and let $c$ and $c'$ be two positive constants.
Furthermore, let $p \in (0, 1)$ and let $\mathcal{H}$ be a $k$-uniform hypergraph that satisfy the assumptions
of Proposition 3.1. Let $\delta = (ck2^{k+1})^{-k} c'$ and let $b = pv(\mathcal{H})$. We shall use the Scythe
Algorithm, described in Section 3.1, to construct a family $\mathcal{S}$ and functions $f_0$ and $g_0$ as in
the statement of Proposition 3.1. We shall obtain them by running the following algorithm
(with $\mathcal{H}_k = \mathcal{H}$) on every independent set $I \in \mathcal{I}(\mathcal{H})$. We shall define $f_0$ somewhat implicitly
by defining a function $f_0^*\colon \mathcal{I}(\mathcal{H}) \to \mathcal{P}(V(\mathcal{H}))$ that is constant on the set $g_0^{-1}(S)$ for every
$S \in \mathcal{S}$.

Constructing $g_0$ and $f_0^*$. Given an $I \in \mathcal{I}(\mathcal{H})$, set $i = k - 1$ and repeat the following:

(1) Apply the Scythe Algorithm to $\mathcal{H}_{i+1}$ and $I$. Suppose that it outputs $\mathcal{H}_i$, $A_i$ and $B_i$.

(2) If $|A_i| \le (1-\delta)v(\mathcal{H})$, then set $q = i$, $r = i + 1$ and STOP.

(3) If $i > 1$, then set $i = i - 1$. Otherwise, set $q = r = 1$ and STOP.

Let $I$ be an independent set and let us execute the above procedure (with $\mathcal{H}_k = \mathcal{H}$) on
$I$. We claim that for every $i \in \{r, \ldots, k\}$, the hypergraph $\mathcal{H}_i$ satisfies properties (P1)-(P4)
defined in Section 3.1. This follows by induction on $k - i$. The base of the induction, the
case $i = k$, follows vacuously from the definitions of $c_k$ and $\Delta_\ell^k$ for $\ell \in [k-1]$. The inductive
step follows from Lemmas 3.5, 3.8 and 3.10. To see this, note that since $|A_i| > (1-\delta)v(\mathcal{H}) \ge
(1-c_i)v(\mathcal{H})$ for all $i \in \{r, \ldots, k-1\}$, then (8) in Lemma 3.10 always holds.

Now, let us define $g_0(I)$ and $f_0^*(I)$. Suppose first that $r > 1$ and note that in this case,
the algorithm stopped in step (2), which means that $|A_q| \le (1-\delta)v(\mathcal{H})$; we set

$$g_0(I) = B_{k-1} \cup \ldots \cup B_q \quad\text{and}\quad f_0^*(I) = A_q.$$

On the other hand, if $r = 1$, then we set

$$g_0(I) = B_{k-1} \cup \ldots \cup B_1 \quad\text{and}\quad f_0^*(I) = \{v \in V(\mathcal{H}_1) : \{v\} \notin \mathcal{H}_1\}.$$

Finally, we let

$$\mathcal{S} = \{g_0(I) : I \in \mathcal{I}(\mathcal{H})\}.$$

We will define $f_0$ by letting $f_0(S) = f_0^*(I)$ for some $I \in g_0^{-1}(S)$. We first show that this
definition will not depend on the choice of $I$. In fact, we shall prove a slightly stronger
statement, which also establishes the consistency property of $g_0$ stated in the final line of
Proposition 3.1.

Claim. Suppose that for some $I, I' \in \mathcal{I}(\mathcal{H})$, $g_0(I) \subseteq I'$ and $g_0(I') \subseteq I$. Then $g_0(I) = g_0(I')$
and $f_0^*(I) = f_0^*(I')$.

Proof of claim. Suppose that while running the algorithm on some $I$, we obtain a sequence
$(B_{k-1}, \ldots, B_q)$. Since $g_0(I)$ depends solely on $(B_{k-1}, \ldots, B_q)$ and, by Lemma 3.5, for each $i$,
the hypergraph $\mathcal{H}_i$ and the set $A_i$ depend only on $(B_{k-1}, \ldots, B_i)$, then also $f_0^*(I)$ depends
solely on $(B_{k-1}, \ldots, B_q)$. Hence, it suffices to show that if, while running the algorithm on
some $I'$ with $B_{k-1} \cup \ldots \cup B_q \subseteq I'$, we obtain a sequence $(B_{k-1}', \ldots, B_{q'}')$ with $B_{k-1}' \cup \ldots \cup B_{q'}' \subseteq
I$, then $(B_{k-1}', \ldots, B_{q'}') = (B_{k-1}, \ldots, B_q)$. To this end, let us first observe that, under the
above assumptions, for every $i \in [k-1]$, if $\mathcal{H}_{i+1} = \mathcal{H}_{i+1}'$, then $B_i = B_i'$. Indeed, note that
$B_i$ and $B_i'$ are the outputs of the Scythe Algorithm executed on the inputs $(\mathcal{H}_{i+1}, I)$ and
$(\mathcal{H}_{i+1}', I')$, respectively. Hence, if $\mathcal{H}_{i+1} = \mathcal{H}_{i+1}'$, then since

$$B_i \subseteq B_{k-1} \cup \ldots \cup B_q \subseteq I' \quad\text{and}\quad B_i' \subseteq B_{k-1}' \cup \ldots \cup B_{q'}' \subseteq I,$$

then Lemma 3.6 implies that $B_i = B_i'$. Since clearly $\mathcal{H}_k = \mathcal{H}_k' = \mathcal{H}$ and, as noted before,
for each $i$, $\mathcal{H}_{i+1}$ depends only on $(B_{k-1}, \ldots, B_{i+1})$, it follows that $B_i = B_i'$ for all $i$, as
required. □

By the above claim, we can define $f_0$ by letting, for every $S \in \mathcal{S}$, $f_0(S) = f_0^*(I)$ for any
$I \in g_0^{-1}(S)$. Finally, let us show that the $\mathcal{S}$, $g_0$, and $f_0$, which we have just defined, satisfy
the required conditions, that is, for all $I, I' \in \mathcal{I}(\mathcal{H})$,

(i) $|S| \le (k-1)pv(\mathcal{H})$ for every $S \in \mathcal{S}$,

(ii) $g_0(I) \subseteq I \subseteq f_0(g_0(I)) \cup g_0(I)$,

(iii) $|f_0(g_0(I))| \le (1-\delta)v(\mathcal{H})$,

(iv) $g_0(I) \subseteq I'$ and $g_0(I') \subseteq I$ imply that $g_0(I) = g_0(I')$.

To see (i), simply recall that $|B_i| \le pv(\mathcal{H})$ for every $i \in [k-1]$. To see (ii), note that
$B_i \subseteq I \subseteq A_i \cup B_i$ for every $i \in \{q, \ldots, k-1\}$, by Lemma 3.5, that $I$ is an independent
set in $\mathcal{H}_1$ (if $r = 1$) and, crucially, that $f_0(g_0(I)) = f_0^*(I)$. To see (iii), note that if $r >
1$, then $|A_q| \le (1-\delta)v(\mathcal{H})$, see step (2) of the algorithm; if $r = 1$, then observe that
$|\{v \in V(\mathcal{H}_1) : \{v\} \notin \mathcal{H}_1\}| \le (1-\delta)v(\mathcal{H})$ since $\mathcal{H}_1$ satisfies property (P4) and hence

$$e(\mathcal{H}_1) \ge c_1 p^{k-1} e(\mathcal{H}) \ge c_1 c' \cdot v(\mathcal{H}) \ge \delta v(\mathcal{H}),$$

where the second inequality follows from our assumption that $p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$. Finally,
(iv) follows directly from the claim. □

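Proof of Theorem 2.2. The theorem follows by applying Proposition 3.1 a bounded number
of times. Given an integer $k$ and positive reals $c$, $c'$ and $\varepsilon$, let $\delta = \delta_{3.1}(c/\varepsilon, \varepsilon c')$ and let

$$C = (k-1) \cdot \left( \frac{1}{\delta} \log\frac{1}{\varepsilon} + 1 \right).$$

Let $V$ be a finite set and let $\mathcal{F}$ be an increasing family of subsets of $V$ such that $|A| \ge \varepsilon|V|$
for every $A \in \mathcal{F}$. Let $p \in (0, 1)$ and suppose that $\mathcal{H}$ is a $k$-uniform hypergraph on the
vertex set $V$ that is $(\mathcal{F}, \varepsilon)$-dense and satisfies the assumptions of the theorem, that is,
$p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$ and

$$\Delta_\ell(\mathcal{H}) \le c \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}$$

for every $\ell \in [k-1]$. We now show how to construct a family $\mathcal{S} \subseteq \binom{V(\mathcal{H})}{\le Cp\,v(\mathcal{H})}$ and functions
$f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}) \to \mathcal{S}$ such that

$$g(I) \subseteq I \quad\text{and}\quad I \setminus g(I) \subseteq f(g(I))$$ (13)

for every $I \in \mathcal{I}(\mathcal{H})$. Similarly as in the proof of Proposition 3.1, we shall define $f$ via a
function $f^*\colon \mathcal{I}(\mathcal{H}) \to \mathcal{P}(V)$ that is constant on each set $g^{-1}(S)$ with $S \in \mathcal{S}$.

Fix some $I \in \mathcal{I}(\mathcal{H})$. Using Proposition 3.1, we shall construct a sequence $(A_j, S_j)_{j=1}^{J}$ of
pairs of subsets of $V$ such that for each $j$,

$$S_1 \cup \ldots \cup S_j \subseteq I \subseteq A_j \cup S_1 \cup \ldots \cup S_j.$$

Moreover, $A_J \notin \mathcal{F}$ while $|S_1 \cup \ldots \cup S_J| \le Cpv(\mathcal{H})$. Crucially, the set $A_J$ will depend solely
on $S_1 \cup \ldots \cup S_J$. We will let $g(I) = S_1 \cup \ldots \cup S_J$ and $f^*(I) = A_J$.

Construction. Let $S_0 = \emptyset$ and let $A_0 = V$. For $j = 0, 1, \ldots$, do the following:

(1) If $A_j \in \mathcal{F}$, then let $I_j = I \cap A_j$ and apply Proposition 3.1 with $c_{3.1} = c/\varepsilon$, $c'_{3.1} = \varepsilon c'$
and $p_{3.1} = p$ to the hypergraph $\mathcal{H}[A_j]$ and the set $I_j$ to obtain sets $g_0(I_j)$ and $f_0(g_0(I_j))$
such that $g_0(I_j) \subseteq I_j$ and $I_j \setminus g_0(I_j) \subseteq f_0(g_0(I_j))$. Otherwise, if $A_j \notin \mathcal{F}$, then STOP.

(2) Let $S_{j+1} = g_0(I_j)$ and let $A_{j+1} = f_0(g_0(I_j))$.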

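Let us first show that the above procedure is well-defined, that is, that the assumptions
of Proposition 3.1 are satisfied each time we are in (1). To this end, fix some $A \subseteq V$ and
note that if $A \in \mathcal{F}$, then, since $\mathcal{H}$ is $(\mathcal{F}, \varepsilon)$-dense,

$$p^{k-1}e(\mathcal{H}[A]) \ge \varepsilon p^{k-1}e(\mathcal{H}) \ge \varepsilon c'v(\mathcal{H}) \ge \varepsilon c'v(\mathcal{H}[A])$$

and

$$\Delta_\ell(\mathcal{H}[A]) \le \Delta_\ell(\mathcal{H}) \le c \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\} \le \frac{c}{\varepsilon} \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H}[A])}{v(\mathcal{H}[A])} \right\}.$$

Next, let us show that the above procedure terminates, therefore producing a finite sequence
$(A_j, S_j)$ with $j \in [J]$. To this end, let us simply note that by Proposition 3.1, $|A_{j+1}| \le
(1-\delta)|A_j|$ for all $j$, $A_0 = V$ and $|A| \ge \varepsilon|V|$ for every $A \in \mathcal{F}$. Moreover, since $A_{J-1} \in \mathcal{F}$,
then

$$\varepsilon|V| \le |A_{J-1}| \le (1-\delta)^{J-1}|A_0| \le \exp(-(J-1)\delta)|V|$$

and hence $J \le \frac{1}{\delta}\log\frac{1}{\varepsilon} + 1$. It immediately follows that

$$|S_1 \cup \ldots \cup S_J| \le \sum_{j=1}^{J} |S_j| \le \sum_{j=1}^{J} (k-1)p\,|A_{j-1}| \le J(k-1)pv(\mathcal{H}) \le Cpv(\mathcal{H}).$$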

Finally, let $\mathcal{S} = \{g(I) : I \in \mathcal{I}(\mathcal{H})\}$. It remains to show that for every $S \in \mathcal{S}$, $f^*$ is constant
on $g^{-1}(S)$. Similarly as in the proof of Proposition 3.1, we shall prove a somewhat stronger
statement.

Claim. Suppose that for some $I, I' \in \mathcal{I}(\mathcal{H})$, $g(I) \subseteq I'$ and $g(I') \subseteq I$. Then $g(I) = g(I')$ and
$f^*(I) = f^*(I')$.

Proof of claim. Suppose that while running the above procedure on some $I$, we generate a
sequence $(A_j, S_j)_{j=1}^{J}$. Since for each $j$, $A_{j+1}$ depends solely on $A_j$ and $S_{j+1}$, where $A_0 = V$,
then both $g(I)$ and $f^*(I)$ depend solely on $(S_1, \ldots, S_J)$. Hence, it suffices to show that if,
while running the above procedure on some $I'$ with $S_1 \cup \ldots \cup S_J \subseteq I'$, we generate a sequence
$(A_j', S_j')_{j=1}^{J'}$ with $S_1' \cup \ldots \cup S_{J'}' \subseteq I$, then $(S_1, \ldots, S_J) = (S_1', \ldots, S_{J'}')$. To this end, it suffices
to note that if $A_j = A_j'$, then, since

$$S_{j+1} \subseteq S_1 \cup \ldots \cup S_J \subseteq I' \quad\text{and}\quad S_{j+1}' \subseteq S_1' \cup \ldots \cup S_{J'}' \subseteq I,$$

by the consistency property of $g_0$ stated in the final line of Proposition 3.1, $S_{j+1} = S_{j+1}'$.
Since $A_0 = A_0' = V$ and for each $j$, $A_j$ depends only on $(S_1, \ldots, S_j)$, it follows that $S_j = S_j'$
for all $j$, as required. □

Finally, for every $S \in \mathcal{S}$, we let $f(S) = f^*(I)$ for some $I \in g^{-1}(S)$. This completes the
proof of Theorem 2.2. □

4. SZEMEREDI'S THEOREM FOR SPARSE SETS 

In this section, we prove Theorem 1.1 and derive from it Corollary 1.2. Before we get to
the proofs, let us first remark that Theorem 1.1 and Corollary 1.2 are both sharp up to the
value of the constant $C$ in the lower bounds for $p$ and $m$. More precisely, let us make the
following two observations.

(1) For every /3 G (0,1), there is a positive c such that if m ^ cn^~^^^''~^\ then the 
number of m-subsets of [n] that contain no fc-term AP is at least (1 — • To see 

this, let e = (3'^ and observe that if c is sufficiently small and m ^ cn^^^/^^~^\ then 
the expected number of fc-term APs in a random (1 + £:)m-subset of [n] is smaller 
than em/2 and hence by Markov's inequality, at least half of all (1 + £)m-subsets 
of [n] contain a subset of size m with no fc-term AP. Hence 



# 




(2) There is a positive constant $c$ such that if $pn \le cn^{1-1/(k-1)}$, then

$$\mathbb{P}([n]_p \text{ is } (\delta, k)\text{-Szemerédi}) \to 0 \quad\text{as } n \to \infty.$$

For a (simple) proof of this statement, we refer the reader to [58].
We shall in fact prove the following somewhat stronger version of Corollary 1.2, originally
proved by Schacht [58] (the approach of Conlon and Gowers [12] yields a somewhat weaker
probability estimate).

Corollary 4.1. For every $k \in \mathbb{N}$ and every $\delta \in (0, 1)$, there exists a constant $C$ such that
for all sufficiently large $n$, if $p \ge Cn^{-1/(k-1)}$, then

$$\mathbb{P}([n]_p \text{ is } (\delta, k)\text{-Szemerédi}) \ge 1 - 2\exp(-pn/8).$$

In the proofs of Theorem 1.1 and Corollary 4.1, and frequently in later sections, we shall
need various estimates on binomial coefficients, which we list here for future reference. Let
$a$, $b$, and $c$ be integers satisfying $a \ge b \ge c \ge 0$. Then the following inequalities hold:
a\ ea



&\ ( h\'^ ( a 



a 



We remark that each inequality above follows easily from the definition of $\binom{a}{b}$.

Proof of Corollary \4l\ Fix k e N and 6 G (0,1), let (3 = 6/{2e) ■ e'^l^ and set C = 
2 G^l^ (3, k)/5. Assume that p ^ Cn^^/'^'^^^\ let m = 5pn/2, and let Xm denote the number 
of fc-term-AP-free m-subsets of [n]p. By Theorem 11.11 and f|T^ . we have 



P(x.„>OK.,xj,(^:),..,(^)"^(af£)' 



Let $A$ denote the event that $[n]_p$ is not $(\delta, k)$-Szemerédi, i.e., that there exists a $k$-term-
AP-free subset of [n]p with 5|[n]p| elements. By (fTSj) and Chernoff's inequality (see, e.g., [Sj 
Appendix A]), it follows that 

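$$\mathbb{P}(A) \le \mathbb{P}\left(A \wedge |[n]_p| \ge \frac{pn}{2}\right) + \mathbb{P}\left(|[n]_p| < \frac{pn}{2}\right) \le \mathbb{P}(X_m > 0) + e^{-pn/8} \le 2e^{-pn/8},$$

as required. □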

Finally, let us show how to deduce Theorem 1.1 from Theorem 2.2. Our proof will use the
following robust version of Szemerédi's theorem, which can be proved by a simple averaging
argument, originally observed by Varnavides [65].

Lemma 4.2. For every positive $\delta$ and $k \in \mathbb{N}$, there exists a positive $\varepsilon$ such that the following
holds for all sufficiently large $n$. Every subset of $[n]$ with at least $\delta n$ elements contains at
least $\varepsilon n^2$ $k$-term APs.
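
For small instances, the quantity bounded in Lemma 4.2 can be computed directly. The following is a brute-force sketch that counts the $k$-term APs inside a given set of integers; the function name `count_aps` is hypothetical, and progressions are counted as sets with positive common difference, which is an assumption of this sketch.

```python
def count_aps(S, k):
    """Number of k-term arithmetic progressions (positive difference) inside S."""
    S = set(S)
    if not S:
        return 0
    top = max(S)
    count = 0
    for a in S:
        d = 1
        while a + (k - 1) * d <= top:
            if all(a + t * d in S for t in range(k)):
                count += 1
            d += 1
    return count

print(count_aps(range(1, 10), 3))   # all 3-term APs in [9]
print(count_aps({1, 2, 4, 5}, 3))   # a 3-AP-free set
```

The running time is roughly $|S| \cdot \max(S) \cdot k$, so this is only meant for experiments with small $n$.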

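Proof of Theorem 1.1. Given $k \in \mathbb{N}$ and positive $\beta$, let $\delta = \min\{\beta/2, 1/10\}$ and let $n \in \mathbb{N}$
be sufficiently large. Let $\mathcal{H}$ be the $k$-uniform hypergraph of $k$-term APs in $[n]$, i.e., the
hypergraph on the vertex set $[n]$ whose edges are all $k$-term APs in $[n]$, let $\mathcal{F}$ denote the
family of subsets of $[n]$ with at least $\delta n$ elements, and let $\varepsilon = \varepsilon_{4.2}(\delta, k)$. By Lemma 4.2,
the hypergraph $\mathcal{H}$ is $(\mathcal{F}, \varepsilon)$-dense, provided that $n$ is sufficiently large. Let $p = n^{-1/(k-1)}$, let
$c' = 1/k^2$, and let $c = 2k^2$. Since $e(\mathcal{H}) \ge c'n^2$, it follows that $p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$. Moreover,

$$\Delta_1(\mathcal{H}) \le k \cdot \frac{n}{k-1} \le cc'n \le c \cdot \min\left\{ p^{1-k},\; p^{1-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}$$

and for every $\ell \in \{2, \ldots, k-1\}$,

$$\Delta_\ell(\mathcal{H}) \le \binom{k}{2} \le cc'n^{(k-\ell)/(k-1)} \le c \cdot \min\left\{ p^{\ell-k},\; p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}.$$

Let $C' = C_{2.2}(k, \varepsilon, c, c')$ and let $C = C'/\delta$ and assume that $m \ge Cn^{1-1/(k-1)}$. Note that if
$m > \delta n/2$, then $\mathcal{I}(\mathcal{H}, m) = \emptyset$ by Szemerédi's theorem, so we may assume that $m \le \delta n/2$.
Since $C'pn \le \delta m$, then by Theorem 2.2, there exists a family $\mathcal{S} \subseteq \binom{[n]}{\le C'pn} \subseteq \binom{[n]}{\le \delta m}$ and
functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}) \to \mathcal{S}$, such that for every $I \in \mathcal{I}(\mathcal{H})$,

$$g(I) \subseteq I \quad\text{and}\quad I \setminus g(I) \subseteq f(g(I)).$$

Therefore, using (14) and (16), the number of independent sets of size $m$ in $\mathcal{H}$ can be
estimated as follows: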

|X(H,m)| = GX(H,m): g{I) = 5}K ( '^^^M 



SG5 Se<S 

^ \k J \m — k J ^ V/c/ \Sn — m J \m 



Since m ^ 6n/2 and the function x ^-> {y/xY is increasing on (0,?//e), it follows that 

'«« C)"(rc)<(:)^ 

where the final inequality follows since (^) ^ 2~'^{^^^), by ([I5D, and since 2^'^ > 2e/6'^ if 
^ ^ 1/10. This proves Theorem II .11 □ 

Finally, from the same proof, combined with an analogue of Lemma 4.2 due to Furstenberg and Katznelson [25], we obtain the following generalization of Theorem 1.1, which strengthens Theorem 2.3 of [58]. Given a set $F \subseteq \mathbb{N}^\ell$, we call a set of the form $a + bF = \{a + bx : x \in F\}$, with $a \in \mathbb{Z}^\ell$ and $b \in \mathbb{Z} \setminus \{0\}$, a homothetic copy of $F$.

Theorem 4.3. For every positive $\beta$, every $\ell \in \mathbb{N}$, and every finite configuration $F \subseteq \mathbb{N}^\ell$, there exist constants $C$ and $n_0$ such that the following holds. For every $n \in \mathbb{N}$ with $n \ge n_0$, if $m \ge Cn^{\ell - 1/(|F|-1)}$, then there are at most

\[ \binom{\beta n^\ell}{m} \]

$m$-subsets of $[n]^\ell$ that contain no homothetic copy of $F$.



Finally, we mention one more straightforward application of Theorem 2.2, which is a sparse version of a result of Sárközy and Furstenberg [21] on square differences. A robust version of it was proved by Hamel and Łaba [32, Theorem 3.1] using a Varnavides-type averaging argument. The following theorem improves Theorem 1.2 of

Theorem 4.4. For every positive $\beta$, there exist constants $C$ and $n_0$ such that the following holds. For every $n \in \mathbb{N}$ with $n \ge n_0$, if $m \ge C\sqrt{n}$, then there are at most

\[ \binom{\beta n}{m} \]

$m$-subsets of $[n]$ that contain no pair $\{x, y\}$ such that $x - y$ is a perfect square.
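
For very small $n$ the sets counted in Theorem 4.4 can be enumerated directly. The sketch below is a brute-force illustration only (it is exponential in $n$ and plays no role in the proof); the function names are ours.

```python
from itertools import combinations
from math import isqrt


def has_square_difference(subset):
    """Return True if some pair x != y in subset has |x - y| a perfect square."""
    items = sorted(set(subset))
    for i, x in enumerate(items):
        for y in items[i + 1:]:
            d = y - x  # d >= 1 because items are sorted and distinct
            if isqrt(d) ** 2 == d:
                return True
    return False


def count_square_difference_free(n, m):
    """Count m-subsets of {1, ..., n} with no pair at a perfect-square distance."""
    return sum(
        1
        for subset in combinations(range(1, n + 1), m)
        if not has_square_difference(subset)
    )


if __name__ == "__main__":
    print(count_square_difference_free(5, 2))
```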

5. Extremal results for sparse sets 

In this section, we shall deduce from Theorem 2.2 two versions of the general transference theorem of Schacht [58, Theorem 3.3]. We remind the reader that a statement very similar to Schacht's theorem was proved independently by Conlon and Gowers [12]. For the benefit of the readers who are familiar with [58], we shall state it using the terminology used there.

Definition 5.1. Let $\mathcal{H} = (\mathcal{H}_n)_{n \in \mathbb{N}}$ be a sequence of $k$-uniform hypergraphs and let $\alpha \in [0, 1)$. We say that $\mathcal{H}$ is $\alpha$-dense if the following is true: For every positive $\delta$, there exist positive $\varepsilon$ and $n_0$ such that for every $n$ with $n \ge n_0$ and every $U \subseteq V(\mathcal{H}_n)$ with $|U| \ge (\alpha + \delta)v(\mathcal{H}_n)$, we have

\[ e(\mathcal{H}_n[U]) \ge \varepsilon e(\mathcal{H}_n). \]

Let us remark here that Definition 2.1 is a generalization of Definition 5.1. Indeed, if $\mathcal{F}_\delta$ denotes the collection of all subsets of $V(\mathcal{H}_n)$ with at least $(\alpha + \delta)v(\mathcal{H}_n)$ elements, then a sequence $\mathcal{H}$ of hypergraphs is $\alpha$-dense if and only if for every positive $\delta$, there exists a positive $\varepsilon$ such that for all sufficiently large $n$, the hypergraph $\mathcal{H}_n$ is $(\mathcal{F}_\delta, \varepsilon)$-dense.

We start with the 'random' version of our extremal result, which is a slight weakening of Theorem 3.3 of [58]; see the discussion below.

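Theorem 5.2. Let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs, let $\alpha \in [0, 1)$, and let $c$ and $c'$ be positive constants. Suppose that $\mathbf{p} \in [0,1]^{\mathbb{N}}$ is a sequence of probabilities such that for all sufficiently large $n$ we have $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and for every $\ell \in [k-1]$,

\[ \Delta_\ell(\mathcal{H}_n) \le c \cdot \min\left\{ p_n^{\ell-k}, p_n^{\ell-1}\frac{e(\mathcal{H}_n)}{v(\mathcal{H}_n)} \right\}. \tag{19} \]

If $\mathcal{H}$ is $\alpha$-dense, then the following holds. For every positive $\delta$, there exists a constant $C$ such that if $q_n \ge Cp_n$ and $q_nv(\mathcal{H}_n) \to \infty$ as $n \to \infty$, then a.a.s.

\[ \alpha\big(\mathcal{H}_n[V(\mathcal{H}_n)_{q_n}]\big) \le (\alpha + \delta)q_nv(\mathcal{H}_n). \]

We note that the probability bounds implicit in the 'asymptotically almost surely' statement that we obtain are, as in [58], optimal, that is, they decay exponentially in $p_nv(\mathcal{H}_n)$.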

Remark 5.3. We remark that the only difference between Theorem 5.2 and Theorem 3.3 of [58] lies in the assumptions on the hypergraph sequence $\mathcal{H}$, which are somewhat more restrictive here. In fact, it turns out that the conditions $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and $\Delta_\ell(\mathcal{H}_n) \le c \cdot p_n^{\ell-1}e(\mathcal{H}_n)/v(\mathcal{H}_n)$ for every $\ell \in [k-1]$ are essentially equivalent to the condition that $\mathcal{H}$ is $(K, \mathbf{p})$-bounded (see below), whereas the condition $\Delta_\ell(\mathcal{H}_n) \le cp_n^{\ell-k}$ for every $\ell \in [k-1]$, which is essential in the theorem above, is not needed in [58].

To be more precise, let us first recall from [58] that a sequence of $k$-uniform hypergraphs $\mathcal{H}$ is said to be $(K, \mathbf{p})$-bounded if

\[ \mu_i(\mathcal{H}_n, q) = \mathbb{E}\Big[ \sum_{v \in V} \deg_i^2(v, V_q) \Big] \le Kq^{2i}\frac{e(\mathcal{H}_n)^2}{v(\mathcal{H}_n)} \]

for every $i \in [k-1]$, every $q$ with $q \ge p_n$, and every sufficiently large $n$, where $V$ abbreviates $V(\mathcal{H}_n)$ and $\deg_i(v, V_q)$ denotes the number of edges of $\mathcal{H}_n$ which contain $v$ and at least $i$ other vertices of $V_q$. We claim that if (a) $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and (b) $\Delta_\ell(\mathcal{H}_n) \le c \cdot p_n^{\ell-1}e(\mathcal{H}_n)/v(\mathcal{H}_n)$ for every $\ell \in [k-1]$, then $\mathcal{H}$ is $(K, \mathbf{p})$-bounded for some constant $K$ that depends only on $c$, $c'$, and $k$. Indeed, we have

2 



e=o V * / 



for every $i \in [k-1]$ and $q$ with $q \ge p_n$, where we used (b) to bound both $\Delta_1(\mathcal{H}_n)$ and $\Delta_{\ell+1}(\mathcal{H}_n)$, except in the case $i = \ell = k-1$, where we used (a) and the fact that $\Delta_k(\mathcal{H}_n) = 1$. Conversely, suppose that $\mathcal{H}$ is $(K, \mathbf{p})$-bounded and observe that $\mu_{k-1}(\mathcal{H}_n, p_n) \ge p_n^{k-1}e(\mathcal{H}_n)$. Thus, setting $c' = 1/K$, it follows that $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$. Moreover, we claim that for every positive $\varepsilon$, there is a constant $c$ that depends only on $K$, $k$, and $\varepsilon$, such that for all sufficiently large $n$, the hypergraph $\mathcal{H}_n$ contains a subhypergraph $\mathcal{H}'_n \subseteq \mathcal{H}_n$ satisfying $e(\mathcal{H}'_n) \ge (1 - \varepsilon)e(\mathcal{H}_n)$ and

\[ \Delta_\ell(\mathcal{H}'_n) \le c \cdot p_n^{\ell-1}\frac{e(\mathcal{H}_n)}{v(\mathcal{H}_n)} \quad \text{for every } \ell \in [k-1]. \tag{20} \]

Indeed, fix a large constant $c$ and suppose that $\mathcal{H}_n$ does not contain a subhypergraph with at least $(1 - \varepsilon)e(\mathcal{H}_n)$ edges that satisfies (20). In this case, let us greedily construct a hypergraph $\mathcal{H}''_n$ as follows. Start with $\mathcal{H}'_n = \mathcal{H}_n$ and $\mathcal{H}''_n$ empty. Whenever there is an $\ell$-set $T \subseteq V(\mathcal{H}_n)$, for some $\ell \in [k-1]$, whose degree in $\mathcal{H}'_n$ exceeds the right-hand side of (20), move an arbitrary edge containing $T$ from $\mathcal{H}'_n$ to $\mathcal{H}''_n$. By our assumption, when the process terminates, $\mathcal{H}''_n$ will contain more than $\varepsilon e(\mathcal{H}_n)$ edges. Now, if $c$ is sufficiently large (as a function of $\varepsilon$, $k$, and $K$), then for some $\ell \in [k-1]$ we have



MHn,Pn)^j^_^ C P^ ^^^^^ P„ >Ap„^^^^ 



which is a contradiction.

Finally, note that, trivially, if $\mathcal{H}_n$ is $(\mathcal{F}, 2\varepsilon)$-dense for some family $\mathcal{F} \subseteq \mathcal{P}(V(\mathcal{H}_n))$, then every $\mathcal{H}'_n \subseteq \mathcal{H}_n$ with $e(\mathcal{H}'_n) \ge (1 - \varepsilon)e(\mathcal{H}_n)$ is $(\mathcal{F}, \varepsilon)$-dense. It follows from the above discussion that, up to the value of the involved constants, the only difference between Theorem 5.2 and Theorem 3.3 of [58] is in the additional assumption that $\Delta_\ell(\mathcal{H}_n) \le cp_n^{\ell-k}$ for every $\ell \in [k-1]$.

Our methods also yield the following 'counting' analogue of Theorem 5.2, a generalization of Theorem 1.1 that does not follow from the methods of Schacht [58] or Conlon and Gowers [12] and, in the case $\alpha = 0$, can be thought of as a strengthening of Theorem 5.2; see Corollary 1.2.

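Theorem 5.4. Let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs, let $\alpha \in [0, 1)$, and let $c$ and $c'$ be positive constants. Suppose that $\mathbf{p} \in [0,1]^{\mathbb{N}}$ is a sequence of probabilities such that for all sufficiently large $n \in \mathbb{N}$, we have $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and for every $\ell \in [k-1]$,

\[ \Delta_\ell(\mathcal{H}_n) \le c \cdot \min\left\{ p_n^{\ell-k}, p_n^{\ell-1}\frac{e(\mathcal{H}_n)}{v(\mathcal{H}_n)} \right\}. \]

If $\mathcal{H}$ is $\alpha$-dense, then the following holds. For every positive $\delta$, there exists a constant $C$ such that for all sufficiently large $n$, if $m \ge Cp_nv(\mathcal{H}_n)$, then

\[ |\mathcal{I}(\mathcal{H}_n, m)| \le \binom{(\alpha + \delta)v(\mathcal{H}_n)}{m}. \]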



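Proof of Theorem 5.2. Let $\alpha \in [0,1)$, let $k \in \mathbb{N}$, let $\mathbf{p} \in [0,1]^{\mathbb{N}}$, and let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs as in the statement of Theorem 5.2. Furthermore, suppose that $\mathcal{H}$ is $\alpha$-dense and fix some positive $\delta$; without loss of generality, we may assume that $\delta$ is sufficiently small. Let $n \in \mathbb{N}$ be sufficiently large, let $\delta' = \delta/3$, and let $\mathcal{F}$ denote the family of all subsets of $V(\mathcal{H}_n)$ with at least $(\alpha + \delta')v(\mathcal{H}_n)$ elements. Since $\mathcal{H}$ is $\alpha$-dense, it follows that $\mathcal{H}_n$ is $(\mathcal{F}, \varepsilon)$-dense for some small positive $\varepsilon$ that does not depend on $n$. Let $C' = C_{2.2}(k, \varepsilon, c, c')$. By Theorem 2.2, there exist a family $\mathcal{S} \subseteq \binom{V(\mathcal{H}_n)}{\le C'p_nv(\mathcal{H}_n)}$ and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}_n) \to \mathcal{S}$ such that

\[ g(I) \subseteq I \quad \text{and} \quad I \setminus g(I) \subseteq f(g(I)) \]

for every $I \in \mathcal{I}(\mathcal{H}_n)$. Let $C = C'/\delta^3$ and assume that $q_n \ge Cp_n$. Let $m = (\alpha + \delta)q_nv(\mathcal{H}_n)$ and, for the sake of brevity, let us write $V = V(\mathcal{H}_n)$ and $q = q_n$. Observe that

\[ \mathbb{P}\big(\alpha(\mathcal{H}_n[V_q]) \ge m\big) = \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}(\mathcal{H}_n, m)\big) \le \sum_{S \in \mathcal{S}} \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}(\mathcal{H}_n, m) \text{ such that } g(I) = S\big). \tag{21} \]

Fix an $S \in \mathcal{S}$ and let $\mathcal{I}_S = \{I \in \mathcal{I}(\mathcal{H}_n, m) : g(I) = S\}$. We estimate the summand in the right-hand side of (21) as follows:

\[ \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}_S\big) \le \mathbb{P}(S \subseteq V_q) \cdot \mathbb{P}\big(|V_q \cap f(S)| \ge m - |S|\big). \tag{22} \]

To see the above inequality, simply note that for every $I \in \mathcal{I}_S$, we have $I \setminus S \subseteq f(S)$. Now, since $m = (\alpha + 3\delta')q_nv(\mathcal{H}_n)$ and $S \in \mathcal{S}$, then

\[ |S| \le C'p_nv(\mathcal{H}_n) \le \delta^3qv(\mathcal{H}_n) \le \delta'qv(\mathcal{H}_n) \]

and hence $m - |S| \ge (\alpha + 2\delta')qv(\mathcal{H}_n)$. On the other hand, since $|f(S)| < (\alpha + \delta')v(\mathcal{H}_n)$ by the definition of $\mathcal{F}$, then

\[ \mathbb{E}\big[|V_q \cap f(S)|\big] \le (\alpha + \delta')qv(\mathcal{H}_n). \]

Hence, by Chernoff's inequality, we have

\[ \mathbb{P}\big(|V_q \cap f(S)| \ge m - |S|\big) \le \exp\left(-\frac{(\delta')^2qv(\mathcal{H}_n)}{2(\alpha + 2\delta')}\right) \le \exp\left(-\frac{(\delta')^2qv(\mathcal{H}_n)}{4}\right). \tag{23} \]

Finally, note that since $|S| \le \delta^3qv(\mathcal{H}_n)$ for every $S \in \mathcal{S}$, and using (15),

\[ \sum_{S \in \mathcal{S}} \mathbb{P}(S \subseteq V_q) \le \sum_{s \le \delta^3qv(\mathcal{H}_n)} \binom{v(\mathcal{H}_n)}{s}q^s \le \sum_{s \le \delta^3qv(\mathcal{H}_n)} \left(\frac{eqv(\mathcal{H}_n)}{s}\right)^{s} \le \big(\delta^3qv(\mathcal{H}_n) + 1\big)\left(\frac{e}{\delta^3}\right)^{\delta^3qv(\mathcal{H}_n)} \le \exp\left(\frac{(\delta')^2qv(\mathcal{H}_n)}{8}\right). \tag{24} \]

Putting (21), (22), (23), and (24) together, we obtain

\[ \mathbb{P}\big(\alpha(\mathcal{H}_n[V_q]) \ge m\big) \le \sum_{S \in \mathcal{S}} \mathbb{P}(S \subseteq V_q) \cdot \exp\left(-\frac{(\delta')^2qv(\mathcal{H}_n)}{4}\right) \le \exp\big(-\delta^3qv(\mathcal{H}_n)\big), \]

as required. □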

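Proof of Theorem 5.4. Let $\alpha \in [0, 1)$, let $k \in \mathbb{N}$, let $\mathbf{p} \in [0, 1]^{\mathbb{N}}$, and let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs as in the statement of Theorem 5.2. Furthermore, suppose that $\mathcal{H}$ is $\alpha$-dense and fix some positive $\delta$. Let $n$ be sufficiently large, let $\delta' = \delta/2$, and let $\mathcal{F}$ denote the family of all subsets of $V(\mathcal{H}_n)$ with at least $(\alpha + \delta')v(\mathcal{H}_n)$ elements. Since $\mathcal{H}$ is $\alpha$-dense, it follows that $\mathcal{H}_n$ is $(\mathcal{F}, \varepsilon)$-dense for some small positive $\varepsilon$ that does not depend on $n$. Let $C' = C_{2.2}(k, \varepsilon, c, c')$. By Theorem 2.2, there exist a family $\mathcal{S} \subseteq \binom{V(\mathcal{H}_n)}{\le C'p_nv(\mathcal{H}_n)}$ and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}_n) \to \mathcal{S}$ such that

\[ g(I) \subseteq I \quad \text{and} \quad I \setminus g(I) \subseteq f(g(I)) \]

for every $I \in \mathcal{I}(\mathcal{H}_n)$. Let $C = C'/\delta^2$ and assume that $m \ge Cp_nv(\mathcal{H}_n)$. Fix an $S \in \mathcal{S}$, let $\mathcal{I}_S = \{I \in \mathcal{I}(\mathcal{H}_n, m) : g(I) = S\}$, and note for future reference that

\[ |S| \le C'p_nv(\mathcal{H}_n) \le \delta^2m. \tag{25} \]

Since $f(S) \in \overline{\mathcal{F}}$, we have $|f(S)| < (\alpha + \delta')v(\mathcal{H}_n)$. Therefore,

\[ |\mathcal{I}_S| \le \binom{|f(S)|}{m - |S|} \le \binom{(\alpha + \delta')v(\mathcal{H}_n)}{m - |S|}. \]

To see the above inequality, simply note that for every $I \in \mathcal{I}_S$, we have $I \setminus S \subseteq f(S)$.

Now, observe that if $m \ge (\alpha + \delta')v(\mathcal{H}_n)$, then every $m$-subset of $V(\mathcal{H}_n)$ belongs to $\mathcal{F}$ and hence there is no independent set of size $m$. Therefore, from now on we may assume that $m < (\alpha + \delta')v(\mathcal{H}_n) = (\alpha + \delta/2)v(\mathcal{H}_n)$. It follows, using (16) and (17), that

\[ \binom{(\alpha + \delta')v(\mathcal{H}_n)}{m - |S|} \le \left(\frac{\alpha + \delta'}{\alpha + \delta}\right)^{m - |S|} \left(\frac{2m}{\delta v(\mathcal{H}_n)}\right)^{|S|} \binom{(\alpha + \delta)v(\mathcal{H}_n)}{m}. \tag{26} \]

Setting $s = |S|$ and recalling that $s \le \delta^2m$ by (25), we obtain, using (15),

\[ \binom{v(\mathcal{H}_n)}{s} \cdot |\mathcal{I}_S| \le \left(\frac{2em}{\delta s}\right)^{s} e^{-\delta m/8} \binom{(\alpha + \delta)v(\mathcal{H}_n)}{m} \le e^{-\delta^2m} \binom{(\alpha + \delta)v(\mathcal{H}_n)}{m}, \]

provided that $\delta$ is sufficiently small. It follows that

\[ |\mathcal{I}(\mathcal{H}_n, m)| = \sum_{S \in \mathcal{S}} |\mathcal{I}_S| \le \sum_{s \le \delta^2m} \binom{v(\mathcal{H}_n)}{s} \max\big\{|\mathcal{I}_S| : |S| = s\big\} \le \binom{(\alpha + \delta)v(\mathcal{H}_n)}{m}, \]

as claimed. □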
6. Stability results for sparse sets

In this section, we shall deduce from Theorem 2.2 two versions of the general transference theorem for stability results proved by Conlon and Gowers [12]. Similarly as in Section 5, we shall state our results using the terminology used by Schacht [58]. We remark here that in parallel to this work, Schacht's method was adapted to yield sparse random analogues of stability statements by Samotij [55]. The main result of this section is most easily compared with Theorem 3.4 of [55]. We begin by recalling the following definition from [1].

Definition 6.1. Let $\mathcal{H} = (\mathcal{H}_n)_{n \in \mathbb{N}}$ be a sequence of $k$-uniform hypergraphs, let $\alpha$ be a positive real, and let $\mathcal{B} = (\mathcal{B}_n)_{n \in \mathbb{N}}$ be a sequence of sets with $\mathcal{B}_n \subseteq \mathcal{P}(V(\mathcal{H}_n))$. We say that $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable if for every positive $\delta$, there exist positive $\varepsilon$ and $n_0$ such that the following holds. For every $n$ with $n \ge n_0$ and every $U \subseteq V(\mathcal{H}_n)$ with $|U| \ge (\alpha - \varepsilon)v(\mathcal{H}_n)$, we have either $e(\mathcal{H}_n[U]) \ge \varepsilon e(\mathcal{H}_n)$ or $|U \setminus B| \le \delta v(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$.

Roughly speaking, a sequence $\mathcal{H}$ of hypergraphs is $(\alpha, \mathcal{B})$-stable if for every $A \subseteq V(\mathcal{H}_n)$ that is almost as large as $\alpha v(\mathcal{H}_n)$, the set $A$ is either very 'close' to some extremal set $B \in \mathcal{B}_n$ or it contains 'many' (a positive fraction of all) edges of $\mathcal{H}_n$. Note that in many natural settings, such a property does hold, for example, as a consequence of the Erdős-Simonovits stability theorem [13, 60] and the removal lemma for graphs.

We again start with the 'random' version of our stability result, which is a slight weakening of Theorem 3.4 of [55]; see the discussion below Theorem 5.2.

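Theorem 6.2. Let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs, let $\alpha \in (0, 1)$, and let $c$ and $c'$ be positive constants. Let $\mathbf{p}$ be a sequence of probabilities such that $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and, for every $\ell \in [k-1]$,

\[ \Delta_\ell(\mathcal{H}_n) \le c \cdot \min\left\{ p_n^{\ell-k}, p_n^{\ell-1}\frac{e(\mathcal{H}_n)}{v(\mathcal{H}_n)} \right\}, \]

and let $\mathcal{B}$ be a sequence of sets with $\mathcal{B}_n \subseteq \mathcal{P}(V(\mathcal{H}_n))$.

If $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable, then the following holds. For every positive $\delta$, there exist $\varepsilon$ and $C$ such that if $q_n \ge Cp_n$ and $q_nv(\mathcal{H}_n) \to \infty$ as $n \to \infty$, then a.a.s. every independent set $I \subseteq V(\mathcal{H}_n)_{q_n}$ with $|I| \ge (\alpha - \varepsilon)q_nv(\mathcal{H}_n)$ satisfies $|I \setminus B| < \delta q_nv(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$.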

The following theorem, a 'counting' analogue of Theorem 6.2, is our main stability result. A simple version of it, applicable to 3-uniform hypergraphs with $\Delta_2(\mathcal{H}_n) = O(1)$, was proved in [1] and used in [1, 2] to count sum-free subsets in Abelian groups and in the set $[n]$.

Theorem 6.3. Let $\mathcal{H}$ be a sequence of $k$-uniform hypergraphs, let $\alpha \in (0, 1)$, and let $c$ and $c'$ be positive constants. Let $\mathbf{p}$ be a sequence of probabilities such that $p_n^{k-1}e(\mathcal{H}_n) \ge c'v(\mathcal{H}_n)$ and, for every $\ell \in [k-1]$,

\[ \Delta_\ell(\mathcal{H}_n) \le c \cdot \min\left\{ p_n^{\ell-k}, p_n^{\ell-1}\frac{e(\mathcal{H}_n)}{v(\mathcal{H}_n)} \right\}, \]

and let $\mathcal{B}$ be a sequence of sets with $\mathcal{B}_n \subseteq \mathcal{P}(V(\mathcal{H}_n))$.

If $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable, then the following holds. For every positive $\delta$, there exist $\varepsilon$ and $C$ such that if $m \ge Cp_nv(\mathcal{H}_n)$, then there are at most

\[ (1 - \varepsilon)^m \binom{\alpha v(\mathcal{H}_n)}{m} \]

independent sets $I \in \mathcal{I}(\mathcal{H}_n, m)$ such that $|I \setminus B| \ge \delta m$ for every $B \in \mathcal{B}_n$.

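Proof of Theorem 6.2. The proof is similar to the proof of Theorem 5.2. Let $k \in \mathbb{N}$, $\alpha \in (0, 1)$, $\mathbf{p} \in [0, 1]^{\mathbb{N}}$, and $\mathcal{H}$ and $\mathcal{B}$ be as in the statement of Theorem 6.2. Furthermore, suppose that $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable and fix some small positive $\delta$. Let $\varepsilon$ be a small positive constant, let $n$ be sufficiently large, let $\delta' = \delta/3$ and $\varepsilon' = 3\varepsilon$, and set

\[ \mathcal{F} = \big\{ A \subseteq V(\mathcal{H}_n) : |A| \ge (\alpha - \varepsilon')v(\mathcal{H}_n) \text{ and } |A \setminus B| \ge \delta'v(\mathcal{H}_n) \text{ for every } B \in \mathcal{B}_n \big\}. \]

Since $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable, it follows that $\mathcal{H}_n$ is $(\mathcal{F}, \varepsilon)$-dense, provided that $\varepsilon$ is sufficiently small. Let $C' = C_{2.2}(k, \varepsilon, c, c')$. By Theorem 2.2, there exist a family $\mathcal{S} \subseteq \binom{V(\mathcal{H}_n)}{\le C'p_nv(\mathcal{H}_n)}$ and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}_n) \to \mathcal{S}$ such that

\[ g(I) \subseteq I \quad \text{and} \quad I \setminus g(I) \subseteq f(g(I)) \]

for every $I \in \mathcal{I}(\mathcal{H}_n)$. Let $C = C'/\varepsilon^3$ and assume that $q_n \ge Cp_n$. Let $m = (\alpha - \varepsilon)q_nv(\mathcal{H}_n)$ and, for the sake of brevity, let us write $V = V(\mathcal{H}_n)$ and $q = q_n$. Let

\[ \mathcal{I}' = \big\{ I \in \mathcal{I}(\mathcal{H}_n) : |I| \ge m \text{ and } |I \setminus B| \ge \delta qv(\mathcal{H}_n) \text{ for every } B \in \mathcal{B}_n \big\} \]

and let $A$ denote the event that $\mathcal{H}_n[V_q]$ contains an independent set $I \in \mathcal{I}'$. We are required to prove that $\mathbb{P}(A)$ tends to $0$ as $n \to \infty$.

Observe first that

\[ \mathbb{P}(A) \le \sum_{S \in \mathcal{S}} \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}' \text{ such that } g(I) = S\big). \tag{27} \]

Fix an $S \in \mathcal{S}$, let $\mathcal{I}'_S = \{I \in \mathcal{I}' : g(I) = S\}$, and note for future reference that

\[ |S| \le C'p_nv(\mathcal{H}_n) \le \varepsilon^3qv(\mathcal{H}_n). \tag{28} \]

We claim that

\[ \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}'_S\big) \le \mathbb{P}(S \subseteq V_q) \cdot \exp\left(-\frac{\varepsilon^2qv(\mathcal{H}_n)}{2}\right). \tag{29} \]

In order to prove (29), recall that since $f(S) \in \overline{\mathcal{F}}$, we either have $|f(S)| < (\alpha - \varepsilon')v(\mathcal{H}_n)$ or $|f(S) \setminus B| < \delta'v(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$. We therefore consider two cases.

Case 1: $|f(S)| < (\alpha - \varepsilon')v(\mathcal{H}_n)$.

We bound the left-hand side of (29) as follows:

\[ \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}'_S\big) \le \mathbb{P}(S \subseteq V_q) \cdot \mathbb{P}\big(|V_q \cap f(S)| \ge m - |S|\big). \tag{30} \]

In order to justify the above inequality, note that for every $I \in \mathcal{I}'_S$, we have $I \setminus S \subseteq f(S)$. Recall that $\varepsilon' = 3\varepsilon$. Since $m - |S| \ge (\alpha - 2\varepsilon)qv(\mathcal{H}_n)$, by (28), and

\[ \mathbb{E}\big[|V_q \cap f(S)|\big] \le (\alpha - \varepsilon')qv(\mathcal{H}_n) = (\alpha - 3\varepsilon)qv(\mathcal{H}_n), \]

then by Chernoff's inequality we have

\[ \mathbb{P}\big(|V_q \cap f(S)| \ge m - |S|\big) \le \exp\left(-\frac{\varepsilon^2qv(\mathcal{H}_n)}{2}\right). \tag{31} \]

Combining (30) and (31), we obtain (29), as required.

Case 2: $|f(S) \setminus B| < \delta'v(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$.

We estimate the left-hand side of (29) as follows:

\[ \mathbb{P}\big(I \subseteq V_q \text{ for some } I \in \mathcal{I}'_S\big) \le \mathbb{P}(S \subseteq V_q) \cdot \mathbb{P}\big(|V_q \cap (f(S) \setminus B)| \ge \delta qv(\mathcal{H}_n) - |S|\big). \]

This follows from the definition of $\mathcal{I}'$ and the fact that $I \setminus S \subseteq f(S)$ for every $I \in \mathcal{I}'_S$. Since $|f(S) \setminus B| < \delta'v(\mathcal{H}_n)$, we have

\[ \mathbb{E}\big[|V_q \cap (f(S) \setminus B)|\big] < \delta'qv(\mathcal{H}_n), \]

whereas $\delta qv(\mathcal{H}_n) - |S| \ge 2\delta'qv(\mathcal{H}_n)$ by (28) and since $\delta = 3\delta'$. By Chernoff's inequality, it follows that

\[ \mathbb{P}\big(|V_q \cap (f(S) \setminus B)| \ge 3\delta'qv(\mathcal{H}_n) - |S|\big) \le \exp\left(-\frac{3\delta'qv(\mathcal{H}_n)}{8}\right) \le \exp\left(-\frac{\varepsilon^2qv(\mathcal{H}_n)}{2}\right), \]

since $\varepsilon$ was chosen sufficiently small. Thus (29) follows in this case as well.

Finally, note that, since $|S| \le \varepsilon^3qv(\mathcal{H}_n)$ for every $S \in \mathcal{S}$, as in (24), we have

\[ \sum_{S \in \mathcal{S}} \mathbb{P}(S \subseteq V_q) \le \sum_{s=0}^{\varepsilon^3qv(\mathcal{H}_n)} \binom{v(\mathcal{H}_n)}{s}q^s \le \exp\left(\frac{\varepsilon^2qv(\mathcal{H}_n)}{4}\right). \tag{32} \]

Putting (27), (29), and (32) together, we obtain

\[ \mathbb{P}(A) \le \sum_{S \in \mathcal{S}} \mathbb{P}(S \subseteq V_q) \cdot \exp\left(-\frac{\varepsilon^2qv(\mathcal{H}_n)}{2}\right) \le \exp\big(-\varepsilon^3qv(\mathcal{H}_n)\big), \]

as required. □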

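Proof of Theorem 6.3. Let $k \in \mathbb{N}$, $\alpha \in (0, 1)$, $\mathbf{p} \in [0, 1]^{\mathbb{N}}$, and $\mathcal{H}$ and $\mathcal{B}$ be as in the statement of Theorem 6.3. Furthermore, suppose that $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable and fix some positive $\delta$. Let $\delta'$ be a sufficiently small positive constant (depending only on $\alpha$ and $\delta$), let $\varepsilon$ be a small positive constant, and let $n$ be sufficiently large. Let $\varepsilon' = 2\varepsilon$, and set

\[ \mathcal{F} = \big\{ A \subseteq V(\mathcal{H}_n) : |A| \ge (\alpha - \varepsilon')v(\mathcal{H}_n) \text{ and } |A \setminus B| \ge \delta'v(\mathcal{H}_n) \text{ for every } B \in \mathcal{B}_n \big\}. \]

Since $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable, it follows that $\mathcal{H}_n$ is $(\mathcal{F}, \varepsilon)$-dense, provided that $\varepsilon$ is sufficiently small (as a function of $\delta'$). Let $C' = C_{2.2}(k, \varepsilon, c, c')$. By Theorem 2.2, there exist a family $\mathcal{S} \subseteq \binom{V(\mathcal{H}_n)}{\le C'p_nv(\mathcal{H}_n)}$ and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}_n) \to \mathcal{S}$ such that

\[ g(I) \subseteq I \quad \text{and} \quad I \setminus g(I) \subseteq f(g(I)) \]

for every $I \in \mathcal{I}(\mathcal{H}_n)$. Let $C = C'/\varepsilon^2$, assume that $m \ge Cp_nv(\mathcal{H}_n)$, and set

\[ \mathcal{I}' = \big\{ I \in \mathcal{I}(\mathcal{H}_n, m) : |I \setminus B| \ge \delta m \text{ for every } B \in \mathcal{B}_n \big\}. \]

Our task is to bound the size of $\mathcal{I}'$ from above. To this end, fix an $S \in \mathcal{S}$ and let $\mathcal{I}'_S = \{I \in \mathcal{I}' : g(I) = S\}$. Note for future reference that

\[ |S| \le C'p_nv(\mathcal{H}_n) \le \varepsilon^2m. \tag{33} \]

Since $f(S) \in \overline{\mathcal{F}}$, we either have $|f(S)| < (\alpha - \varepsilon')v(\mathcal{H}_n)$ or $|f(S) \setminus B| < \delta'v(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$. We therefore consider two cases.

Case 1: $|f(S)| < (\alpha - \varepsilon')v(\mathcal{H}_n)$.

We claim that in this case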

<y-n)\ ^ {I - er f av{Un)\ 

1^1 m )■ ^^^^ 



To prove (34), we first estimate the size of $\mathcal{I}'_S$ as follows:



m — 15*1/ \ m — \S\ 

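The above inequality follows since $I \setminus S \subseteq f(S)$ for every $I \in \mathcal{I}'_S$. Observe that if $m \ge (\alpha - \varepsilon')v(\mathcal{H}_n)$, then $\mathcal{I}' \subseteq \mathcal{F}$ and hence $\mathcal{I}' = \emptyset$, since $\mathcal{H}_n$ is $(\mathcal{F}, \varepsilon)$-dense. Therefore, from now on let us assume that $m < (\alpha - \varepsilon')v(\mathcal{H}_n) = (\alpha - 2\varepsilon)v(\mathcal{H}_n)$. It follows from (16) and (17), as in (26), that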

m-\S\ J ^ {2ev{'Hn)) \ a J { m 
and hence, since $|S| \le \varepsilon^2m$ and $\varepsilon' = 2\varepsilon$,

as claimed.

Case 2: $|f(S) \setminus B| < \delta'v(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$.

We claim that in this case



To prove (35), we first estimate the size of $\mathcal{I}'_S$ as follows:

\ om — |d I y \m — omy \om — |d |y \m — om/ 

To see the first inequality, recall that every $I \in \mathcal{I}'_S$ contains at least $\delta m - |S|$ elements of $f(S) \setminus B$ for every $B \in \mathcal{B}_n$. Recall that $|S| \le \varepsilon^2m$ and note that therefore, if $m \ge (\alpha/2)v(\mathcal{H}_n)$, then $\delta m - |S| \ge \delta'v(\mathcal{H}_n)$ and hence $\mathcal{I}'_S = \emptyset$. Thus, we may assume that $m < (\alpha/2)v(\mathcal{H}_n)$. It follows, using (16) and (17), that



m — 5mj \v{l-in) — rn J \ m J \v{'Hn) J \ct J \ 
Hence, by (33), (36), and (37), using (15), we have



\S\ 28' J \ 8 J \0! J \ m J \ m 

as claimed, since \S\ ^ e^m and 5' and e were chosen to be sufficiently small. Indeed, note 
that (for this calculation, and assuming that 5 is sufficiently small) 5' = 5^ ■ {5a/2eY/^ and 
e < 5' suffice. 

Finally, by (34) and (35), we obtain

\A = E Pii < E — {Psi : isi = .} « (1 - . 

as claimed. □ 

7. Turán's problem in random graphs

In this section, we shall deduce from Theorems 5.2 and 6.2 the sparse random analogues of the classical theorems of Erdős and Stone [18] and Turán [64] and of Erdős and Simonovits [13, 60], Theorems 1.3 and 1.4. In fact, we will prove a natural generalization of Theorem 1.3 to $t$-balanced $t$-uniform hypergraphs, Theorem 7.2 below, which was already proved by Conlon and Gowers [12] and Schacht [58]. We first recall the following generalization of the notion of 2-density of a graph to $t$-uniform hypergraphs.

Definition 7.1. Let $H$ be a $t$-uniform hypergraph with at least $t + 1$ vertices. We define the $t$-density of $H$, denoted by $m_t(H)$, by

\[ m_t(H) = \max\left\{ \frac{e(H') - 1}{v(H') - t} : H' \subseteq H \text{ with } v(H') \ge t + 1 \right\}. \]

Moreover, we say that $H$ is $t$-balanced if $m_t(H') \le m_t(H)$ for all $H' \subseteq H$.
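
For small hypergraphs the $t$-density can be computed by brute force. The sketch below assumes the formula $m_t(H) = \max (e(H') - 1)/(v(H') - t)$ over subhypergraphs $H'$ with at least $t + 1$ vertices, and uses the observation that for a fixed vertex set the induced subhypergraph maximizes the ratio, so it suffices to range over vertex subsets. The function name `t_density` is ours.

```python
from fractions import Fraction
from itertools import combinations


def t_density(edges, t):
    """Brute-force t-density of a t-uniform hypergraph given as a list of edges."""
    edge_sets = [frozenset(edge) for edge in edges]
    vertices = sorted(set().union(*edge_sets))
    best = None
    for size in range(t + 1, len(vertices) + 1):
        for sub in combinations(vertices, size):
            sub_set = set(sub)
            # number of edges induced on the chosen vertex subset
            induced = sum(1 for edge in edge_sets if edge <= sub_set)
            value = Fraction(induced - 1, size - t)
            if best is None or value > best:
                best = value
    return best


if __name__ == "__main__":
    triangle = [(1, 2), (2, 3), (1, 3)]
    k4 = list(combinations(range(4), 2))
    print(t_density(triangle, 2), t_density(k4, 2))
```

The search is exponential in $v(H)$, which is harmless here since $H$ is a fixed small hypergraph.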

We also recall that the Turán density of a $t$-uniform hypergraph $H$, denoted $\pi(H)$, is defined by

\[ \pi(H) = \lim_{n \to \infty} \frac{\mathrm{ex}(K_n^t, H)}{\binom{n}{t}}, \tag{38} \]

where, as usual, $\mathrm{ex}(K_n^t, H)$ is the Turán number for $H$, that is, the maximum number of edges in an $H$-free $t$-uniform hypergraph with $n$ vertices.

Theorem 7.2. For every $t$-balanced $t$-uniform hypergraph $H$ with $\Delta(H) \ge 2$ and every positive $\delta$, there exists a positive constant $C$ such that if $q_n \ge Cn^{-1/m_t(H)}$, then

\[ \mathbb{P}\left( \mathrm{ex}\big(G^t(n, q_n), H\big) \le (\pi(H) + \delta)q_n\binom{n}{t} \right) \to 1 \]

as $n \to \infty$.
Once again, we emphasize that we actually obtain essentially optimal bounds on the probability in the above statement, i.e., bounds of the form $1 - \exp(-bq_nn^t)$ for some positive constant $b$ that depends only on $H$ and $\delta$.

Similarly as in [58] and [12], Theorems 7.2 and 1.4, and hence also Theorem 1.3, will follow easily from our general transference results, Theorems 5.2 and 6.2, and the classical supersaturation results of Erdős and Simonovits [17] (for Theorem 7.2) and the stability theorem of Erdős and Simonovits [13, 60] together with the so-called graph removal lemma (for Theorem 1.4). We only need to check that the hypergraph of copies of $H$ in the complete hypergraph $K_n^t$, to which we would like to apply our transference theorems, satisfies the assumptions of Theorems 5.2 and 6.2. Since we are going to use this fact several times in this and later sections, we state it as a separate proposition. Let $H$ be an arbitrary $t$-uniform hypergraph. The hypergraph of copies of $H$ in $K_n^t$ is the $e(H)$-uniform hypergraph on the vertex set $E(K_n^t)$ whose edges are the edge sets of all copies of $H$ in $K_n^t$.

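Proposition 7.3. Let $n$ and $t$ be integers with $t \ge 2$ and let $H$ be a $t$-balanced $t$-uniform hypergraph. Set $k = e(H)$ and let $\mathcal{H}$ be the $k$-uniform hypergraph of copies of $H$ in $K_n^t$. There exist positive constants $c$ and $c'$ such that, letting $p = n^{-1/m_t(H)}$, the following holds:

(a) $p^{k-1}e(\mathcal{H}) \ge c'v(\mathcal{H})$.

(b) $\Delta_\ell(\mathcal{H}) \le c \cdot \min\left\{ p^{\ell-k}, p^{\ell-1}\frac{e(\mathcal{H})}{v(\mathcal{H})} \right\}$ for every $\ell \in [k-1]$.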

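Proof. To prove (a), note that $v(\mathcal{H}) = \binom{n}{t} = \Theta(n^t)$ and that $e(\mathcal{H}) = \frac{v(H)!}{|\mathrm{Aut}(H)|}\binom{n}{v(H)} = \Theta(n^{v(H)})$. Since $H$ is $t$-balanced, then

\[ \frac{1}{m_t(H)} = \frac{v(H) - t}{e(H) - 1} \]

and thus

\[ \frac{p^{e(H)-1}e(\mathcal{H})}{v(\mathcal{H})} = n^{-(v(H)-t)} \cdot \Theta\left(\frac{n^{v(H)}}{n^t}\right) \ge c' \tag{39} \]

for some positive constant $c'$, as required.

To prove (b), observe first that, for each $\ell \in [k-1]$,

\[ \Delta_\ell(\mathcal{H}) \le c'' \cdot \max\left\{ n^{v(H) - v(H')} : H' \subseteq H \text{ with } e(H') = \ell \right\} \]

for some positive constant $c''$. Note that for every $H' \subseteq H$, we have $m_t(H') \le m_t(H)$ by our assumption that $H$ is $t$-balanced and $e(H') - 1 \le m_t(H') \cdot (v(H') - t)$ by the definition of $t$-density. Moreover, $p^{e(H)}n^{v(H)} = pn^t$, by the choice of $p$. Thus,

\[ \Delta_\ell(\mathcal{H}) \cdot p^{k-\ell} \le c'' \cdot \max_{H' \subseteq H :\, e(H') = \ell} \frac{p^{e(H)}n^{v(H)}}{p^{e(H')}n^{v(H')}} = c'' \cdot \max_{H' \subseteq H :\, e(H') = \ell} \frac{p^{1 - e(H')}}{n^{v(H') - t}} \le c'' \cdot \max_{H' \subseteq H :\, e(H') = \ell} \frac{\big(n^{1/m_t(H)}\big)^{m_t(H') \cdot (v(H') - t)}}{n^{v(H') - t}} \le c'', \tag{40} \]

since $m_t(H') \le m_t(H)$ and $v(H') \ge t$ for every non-empty $H' \subseteq H$. Moreover, since $e(\mathcal{H})/v(\mathcal{H}) \ge c' \cdot n^{v(H)-t}$, by (a), then

\[ \Delta_\ell(\mathcal{H}) \cdot \frac{v(\mathcal{H})}{p^{\ell-1}e(\mathcal{H})} \le \frac{c''}{c'} \cdot \max_{H' \subseteq H :\, e(H') = \ell} \frac{n^{v(H) - v(H')}}{p^{e(H')-1}n^{v(H)-t}} = \frac{c''}{c'} \cdot \max_{H' \subseteq H :\, e(H') = \ell} \frac{p^{1 - e(H')}}{n^{v(H') - t}} \le \frac{c''}{c'}, \]

where the last inequality follows as in (40). Finally, we let $c = \max\{c''/c', c''\}$. □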

Proof of Theorem 7.2. Let $H$ be a $t$-balanced $t$-uniform hypergraph, let $k = e(H)$, and let $(\mathcal{H}_n)_{n \in \mathbb{N}}$ be the sequence of $k$-uniform hypergraphs of copies of $H$ in $K_n^t$. Let $\alpha = \pi(H)$, let $\delta$ be a positive constant, and let $p_n = n^{-1/m_t(H)}$. It follows easily from the supersaturation theorem of Erdős and Simonovits [17] that $\mathcal{H}$ is $\alpha$-dense, see [58]. Let $C = C_{5.2}(\mathcal{H}, \delta)$ and assume that $q_n \ge Cp_n = Cn^{-1/m_t(H)}$. Note that the assumption that $H$ contains a vertex of degree at least 2 implies that $m_t(H) > 1/t$ and hence $q_nv(\mathcal{H}_n) \to \infty$ as $n \to \infty$. Together with Proposition 7.3, this implies that $\mathcal{H}$ satisfies the assumptions of Theorem 5.2 and hence with probability tending to 1 as $n \to \infty$,

\[ \mathrm{ex}\big(G^t(n, q_n), H\big) = \alpha\big(\mathcal{H}_n[E(G^t(n, q_n))]\big) \le (\pi(H) + \delta)q_n\binom{n}{t}, \]

as required. □

In the proof of Theorems 1.4 and 1.6, we shall need the following proposition, which is a fairly straightforward consequence of the Erdős-Simonovits stability theorem [13, 60] and the graph removal lemma [12]. A proof of this statement can be found in [SJ]. We remark that a new proof of the graph removal lemma, which avoids the use of the Szemerédi regularity lemma, was given recently by Fox



Proposition 7.4. Let $H$ be an arbitrary graph. For every positive $\delta$, there exists a positive $\varepsilon$ such that the following holds for every $n \in \mathbb{N}$. If $G$ is an $n$-vertex graph with

\[ e(G) \ge \left(1 - \frac{1}{\chi(H) - 1} - \varepsilon\right)\binom{n}{2}, \]

then either $G$ may be made $(\chi(H) - 1)$-partite by removing from it at most $\delta n^2$ edges or $G$ contains at least $\varepsilon n^{v(H)}$ copies of $H$.

Proof of Theorem 1.4. Let $H$ be a 2-balanced graph, let $k = e(H)$, and let $(\mathcal{H}_n)_{n \in \mathbb{N}}$ be the sequence of $k$-uniform hypergraphs of copies of $H$ in $K_n$. Let $\alpha = \pi(H) = 1 - \frac{1}{\chi(H) - 1}$, let $\delta$ be a positive constant, and let $p_n = n^{-1/m_2(H)}$. Moreover, let $\mathcal{B}_n$ be the family of all complete $(\chi(H) - 1)$-partite subgraphs of $K_n$. By Proposition 7.4, $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable. Let $C = C_{6.2}(\mathcal{H}, \delta)$, let $\varepsilon = \varepsilon_{6.2}(\mathcal{H}, \delta)$, and assume that $q_n \ge Cp_n = Cn^{-1/m_2(H)}$. Note that the assumption that $H$ contains a vertex of degree at least 2 implies that $m_2(H) > 1/2$ and hence $q_nv(\mathcal{H}_n) \to \infty$ as $n \to \infty$. Together with Proposition 7.3, the discussion above implies that $\mathcal{H}$ satisfies the assumptions of Theorem 6.2 and hence with probability tending to 1 as $n \to \infty$, every $H$-free subgraph $G' \subseteq G(n, q_n)$ satisfies $|G' \setminus B| \le \delta q_nv(\mathcal{H}_n)$ for some $B \in \mathcal{B}_n$. In other words, with probability tending to 1 as $n \to \infty$, every $H$-free subgraph of $G(n, q_n)$ can be made $(\chi(H) - 1)$-partite by removing from it at most $\delta q_n\binom{n}{2}$ edges, as required. □

8. The typical structure of H-free graphs

In this section, we shall deduce from Theorems 5.4 and 6.3 the sparse analogue of the theorem of Erdős, Frankl, and Rödl [12], Theorem 1.5, and an approximate sparse analogue of the result of Erdős, Kleitman, and Rothschild [16], Theorem 1.6. We stress once again that neither proof employs Szemerédi's regularity lemma. In order to prove Theorem 1.5, we are actually going to prove the following natural generalization of it to $t$-balanced $t$-uniform hypergraphs. Generalizing the definition stated in Section 1.3, given integers $n$ and $m$ with $0 \le m \le \binom{n}{t}$ and a $t$-uniform hypergraph $H$, let us denote by $f_{n,m}(H)$ the number of $H$-free $t$-uniform hypergraphs on the vertex set $[n]$ that have exactly $m$ edges.

Theorem 8.1. Let $H$ be a $t$-balanced $t$-uniform hypergraph and let $\delta$ be a positive constant. There exists a constant $C$ such that for every $n$, if $m \ge Cn^{t - 1/m_t(H)}$, then

\[ \binom{\mathrm{ex}(K_n^t, H)}{m} \le f_{n,m}(H) \le \binom{\mathrm{ex}(K_n^t, H) + \delta n^t}{m}. \]

We remark that Theorem 8.1 refines a result of Nagle, Rödl, and Schacht [47], who, using the hypergraph regularity lemma, generalized (1) to $t$-uniform hypergraphs.

Proof of Theorem 8.1. Let $H$ be a $t$-balanced $t$-uniform hypergraph, let $k = e(H)$, and let $(\mathcal{H}_n)_{n \in \mathbb{N}}$ be the sequence of $k$-uniform hypergraphs of copies of $H$ in $K_n^t$. Let $\alpha = \pi(H)$, see (38), let $\delta$ be a positive constant, and let $p_n = n^{-1/m_t(H)}$. It follows easily from the supersaturation theorem of Erdős and Simonovits [17] that $\mathcal{H}$ is $\alpha$-dense, see [58]. Let $C = C_{5.4}(\mathcal{H}, \delta)$ and assume that $m \ge Cn^{t - 1/m_t(H)} \ge Cp_nv(\mathcal{H}_n)$. Note that Proposition 7.3 implies that $\mathcal{H}$ satisfies the assumptions of Theorem 5.4 and hence

\[ f_{n,m}(H) = |\mathcal{I}(\mathcal{H}_n, m)| \le \binom{(\pi(H) + \delta)\binom{n}{t}}{m} \le \binom{\mathrm{ex}(K_n^t, H) + \delta n^t}{m}, \]

as required. The claimed lower bound on $f_{n,m}(H)$ is trivial. □

Proof of Theorem 1.6. Let $H$ be a 2-balanced graph, let $k = e(H)$, and let $(\mathcal{H}_n)_{n \in \mathbb{N}}$ be the sequence of $k$-uniform hypergraphs of copies of $H$ in $K_n$. Let $\alpha = \pi(H) = 1 - \frac{1}{\chi(H) - 1}$, let $\delta$ be a positive constant, and let $p_n = n^{-1/m_2(H)}$. Moreover, let $\mathcal{B}_n$ be the family of all complete $(\chi(H) - 1)$-partite subgraphs of $K_n$. By Proposition 7.4, $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable. Let $C = C_{6.3}(\mathcal{H}, \delta)$, let $\varepsilon = \varepsilon_{6.3}(\mathcal{H}, \delta)$, and assume that $m \ge Cn^{2 - 1/m_2(H)} \ge Cp_nv(\mathcal{H}_n)$. Together with Proposition 7.3, this implies that $\mathcal{H}$ satisfies the assumptions of Theorem 6.3 and hence, letting $f^{\delta}_{n,m}(H)$ denote the number of $H$-free graphs on the vertex set $[n]$ that have exactly $m$ edges and that are not $(\delta, \chi(H) - 1)$-partite,

\[ f^{\delta}_{n,m}(H) \le (1 - \varepsilon)^m \binom{\pi(H)\binom{n}{2}}{m}. \]

Finally, note that (trivially),

\[ f_{n,m}(H) \ge \binom{\mathrm{ex}(n, H)}{m} \ge \binom{\pi(H)\binom{n}{2}}{m} \]

and hence $f^{\delta}_{n,m}(H) = o\big(f_{n,m}(H)\big)$, as claimed. □
For t-uniform hypergraphs there is no general stability theorem known; however, such 
results have been proved for a few specific hypergraphs (see [221 ESI EHl EE]), and in each 
case we obtain a corresponding result for sparse hypergraphs. For example, following [7], 
let F5 denote the '3-uniform triangle', i.e., the hypergraph with edge set isomorphic to 
{123, 124,345}, and say that a 3-uniform hypergraph is triangle-free if it contains no copy 
of $F_5$. The following theorem follows easily, as above, from Theorem 6.3 combined with the hypergraph removal lemma of Gowers [31] and Rödl and Skokan [53] and the stability theorem for 3-uniform triangle-free hypergraphs, which was proved by Keevash and Mubayi [35].

Theorem 8.2. For every positive $\delta$, there exists a constant $C$ such that the following holds. If $m \ge Cn^2$, then almost every triangle-free 3-uniform hypergraph with $n$ vertices and $m$ edges can be made tripartite by removing from it at most $\delta m$ edges.

Proof. Let $(\mathcal{H}_n)_{n \in \mathbb{N}}$ be the sequence of 3-uniform hypergraphs of copies of $F_5$ in $K_n^3$, set $\alpha = 2/9$, and let $\mathcal{B}_n$ denote the collection of all complete tripartite subhypergraphs of $K_n^3$. By the hypergraph removal lemma [53, Theorem 1.3], combined with the stability theorem for triangle-free 3-uniform hypergraphs [35, Theorem 1.6], it follows that $\mathcal{H}$ is $(\alpha, \mathcal{B})$-stable. Note that $F_5$ is 3-balanced (though not strictly balanced); it follows by Proposition 7.3 that $\mathcal{H}$ satisfies the conditions of Theorem 6.3 with $p_n = n^{-1}$. Hence the number of triangle-free 3-uniform hypergraphs with $n$ vertices and $m$ edges that cannot be made tripartite by removing at most $\delta m$ edges is at most

\[ (1 - \varepsilon)^m \binom{\frac{2}{9}\binom{n}{3}}{m}, \]

which easily implies the theorem. □

Finally, we remark that Theorem 8.2 can be seen as an approximate sparse analogue of a result of Balogh and Mubayi [7], who used the hypergraph regularity lemma and [35, Theorem 1.6] to show that almost all triangle-free 3-uniform hypergraphs are tripartite. For similar results for other forbidden hypergraphs, see [8] and [49].

9. The KLR Conjecture 

In this section, we shall deduce from Theorem 12.21 the KLR conjecture for 2-balanced 
graphs. Theorem 11.81 As in the preceding sections, the proof will be a fairly straight- 
forward application of Theorem 12.21 to an appropriately defined hypergraph 7i and family 
C V{V{T-L)). Let H be an arbitrary 2-balanced graph and let T-L be the e(if)-uniform 
hypergraph of canonical copies of H in the complete blow-up of H. Defining an appropriate 
family J-" and showing that H is (J-", e)-dense will require some work. 

Given a graph $H$ and integers $n_1, \ldots, n_{v(H)}$, let us denote by $\mathcal{G}(H; n_1, \ldots, n_{v(H)})$ the
collection of all graphs $G$ constructed in the following way. The vertex set of $G$ is a
disjoint union $V_1 \cup \ldots \cup V_{v(H)}$ of sets of sizes $n_1, \ldots, n_{v(H)}$, respectively, one for each
vertex of $H$. The only edges of $G$ lie between those pairs of sets $(V_i, V_j)$ such that $\{i, j\}$ is
an edge of $H$. Recall the definition of $\mathcal{G}(H, n, m, p, \varepsilon)$ from Section 1.4 and observe that
$\mathcal{G}(H, n, m, p, \varepsilon) \subseteq \mathcal{G}(H; n, \ldots, n)$ for all $m$, $p$, and $\varepsilon$.
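To make the definition concrete, here is a minimal brute-force sketch that builds the complete blow-up of a small graph and counts canonical copies. It assumes that a canonical copy of $H$ is a choice of one vertex from each $V_i$ such that the chosen vertices from $V_i$ and $V_j$ are adjacent whenever $\{i, j\}$ is an edge of $H$ (the definition from Section 1.4, which is assumed here rather than restated). The representation of vertices as pairs `(i, a)` and the function names are illustrative choices.

```python
from itertools import product


def complete_blow_up(h_edges, sizes):
    """Build the largest graph in G(H; n_1, ..., n_v): part i has sizes[i]
    vertices, and parts i, j are completely joined whenever (i, j) is an edge of H."""
    parts = [[(i, a) for a in range(n_i)] for i, n_i in enumerate(sizes)]
    g_edges = {
        frozenset((u, w))
        for i, j in h_edges
        for u in parts[i]
        for w in parts[j]
    }
    return parts, g_edges


def count_canonical_copies(h_edges, parts, g_edges):
    """Count tuples (x_1, ..., x_v) with x_i in V_i that span every edge of H."""
    return sum(
        all(frozenset((choice[i], choice[j])) in g_edges for i, j in h_edges)
        for choice in product(*parts)
    )


# H is a triangle on vertices 0, 1, 2; the parts have sizes 2, 3, 4.
triangle = [(0, 1), (1, 2), (0, 2)]
parts, full = complete_blow_up(triangle, [2, 3, 4])
copies_in_full = count_canonical_copies(triangle, parts, full)

# Deleting one edge between the first two parts destroys the copies through it.
sparser = full - {frozenset(((0, 0), (1, 0)))}
copies_in_sparser = count_canonical_copies(triangle, parts, sparser)
```

In the complete blow-up every choice of one vertex per part is a canonical copy, so the count is $2 \cdot 3 \cdot 4 = 24$; removing a single edge between the first two parts removes the $4$ copies that use it.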

The following lemma, which is a robust version of the embedding lemma stated in
Section 1.4, suggests the right choice of $\mathcal{F}$. We remark that the lemma is well-known, but for
completeness we provide a short proof of it in the form we shall use.

Lemma 9.1. Let $H$ be an arbitrary graph and let $\delta\colon (0,1] \to (0,1)$ be an arbitrary function.
There exist positive constants $\alpha_0$, $\xi$, and $N$ such that for every collection of integers
$n_1, \ldots, n_{v(H)}$ satisfying $n_1, \ldots, n_{v(H)} \ge N$ and every graph $G \in \mathcal{G}(H; n_1, \ldots, n_{v(H)})$, one of
the following holds:

(a) $G$ contains at least $\xi n_1 \cdots n_{v(H)}$ canonical copies of $H$.

(b) There exist a positive constant $\alpha$ with $\alpha \ge \alpha_0$, an edge $\{i, j\} \in E(H)$, and sets $A_i \subseteq V_i$,
$A_j \subseteq V_j$ such that $|A_i| \ge \alpha n_i$, $|A_j| \ge \alpha n_j$, and $d_G(A_i, A_j) < \delta(\alpha)$.

Proof. We prove the lemma by induction on the number of vertices of $H$. If $v(H) = 1$, then
(a) holds vacuously with $\xi = 1$ for every choice of $G$. Let us then assume that $v(H) \ge 2$, let
$v_1$ be the first vertex of $H$ (i.e., the vertex corresponding to the set $V_1$ from the definition of
the family of blow-ups of $H$), set $H' = H - v_1$, and let $\alpha_1 = $ […]. Given a function $\delta$, define
$\delta'$ by letting $\delta'(x) = \delta(\delta(\alpha_1) \cdot x)$ for each $x \in (0, 1]$ and let $\alpha_0'$, $\xi'$, and $N'$ be the constants
obtained by invoking the inductive assumption with $H$ replaced by $H'$ and $\delta$ replaced by $\delta'$.
Furthermore, let

$$\alpha_0 = \min\{\alpha_1, \alpha_0' \cdot \delta(\alpha_1)\}, \qquad N = \frac{N'}{\delta(\alpha_1)}, \qquad \text{and} \qquad \xi = \alpha_1 \cdot \delta(\alpha_1)^{v(H)-1} \cdot \xi'.$$

Now, let $n_1, \ldots, n_{v(H)}$ be integers satisfying $n_1, \ldots, n_{v(H)} \ge N$ and let $G$ be an arbitrary
graph from $\mathcal{G}(H; n_1, \ldots, n_{v(H)})$. Suppose first that for some $v_j \in N_H(v_1)$, the set $W_{1j}$
defined by

$$W_{1j} = \{ w \in V_1 : \deg_G(w, V_j) < \delta(\alpha_1) n_j \}$$

contains at least $\alpha_1 n_1$ vertices. In this case, it is not hard to see that (b) is satisfied with
$\alpha = \alpha_1$, $i = 1$, $A_i = W_{1j}$, and $A_j = V_j$. Hence, since $\alpha_1 = $ […], we may assume that the set
$W_1$ defined by

$$W_1 = V_1 \setminus \bigcup_{j} W_{1j} = \{ w \in V_1 : \deg_G(w, V_j) \ge \delta(\alpha_1) n_j \text{ for all } v_j \in N_H(v_1) \}$$

contains at least $\alpha_1 n_1$ vertices. For each $w \in W_1$, let $G_w$ be the subgraph of $G$ induced by
the set $V_2(w) \cup \ldots \cup V_{v(H)}(w)$, where for each $j \in \{2, \ldots, v(H)\}$,

$$V_j(w) = \begin{cases} V_j \cap N_G(w) & \text{if } v_j \in N_H(v_1), \\ V_j & \text{if } v_j \notin N_H(v_1). \end{cases}$$

Note that the assumption that $w \in W_1$ implies that $|V_j(w)| \ge \delta(\alpha_1) n_j \ge N'$ for each $j$.
Hence, we may apply the induction hypothesis to each graph $G_w$.

Suppose first that for some $w \in W_1$, we obtain an $\alpha$ with $\alpha \ge \alpha_0'$, vertices $i, j \in V(H')$,
and a pair $A_i \subseteq V_i(w) \subseteq V_i$, $A_j \subseteq V_j(w) \subseteq V_j$ such that

$$|A_\ell| \ge \alpha |V_\ell(w)| \ge \alpha \delta(\alpha_1) n_\ell \quad \text{for both } \ell \in \{i, j\}$$

and $d_G(A_i, A_j) = d_{G_w}(A_i, A_j) < \delta'(\alpha) = \delta(\alpha \delta(\alpha_1))$. Then we are done, since (b) is satisfied
with $\alpha$ replaced by $\alpha \delta(\alpha_1)$. Otherwise, for each $w \in W_1$, the number of canonical copies of $H'$ in the
graph $G_w$ is at least $\xi' |V_2(w)| \cdots |V_{v(H)}(w)|$, which is at least $\xi' \delta(\alpha_1)^{v(H)-1} n_2 \cdots n_{v(H)}$. Since
$w$ extends each of those copies to a canonical copy of $H$ in $G$, it follows that in this case,
the number of canonical copies of $H$ in $G$ is at least $|W_1| \cdot \xi' \delta(\alpha_1)^{v(H)-1} n_2 \cdots n_{v(H)}$, which is
at least $\xi n_1 \cdots n_{v(H)}$, as required. □
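The first case of the proof rests on a simple observation: if every vertex of a set $W \subseteq V_1$ has fewer than $\delta n_j$ neighbours in $V_j$, then the pair $(W, V_j)$ has density less than $\delta$. The following sketch checks this on a toy bipartite graph. The graph, the threshold $1/2$, and the helper names are all made up for illustration, and the density of a pair is taken to be $e(A, B)/(|A||B|)$.

```python
from fractions import Fraction


def degree_into(w, part, edges):
    """Number of neighbours of w inside the vertex set `part`."""
    return sum(1 for u in part if frozenset((w, u)) in edges)


def density(a_side, b_side, edges):
    """Edge density e(A, B) / (|A| |B|) of the pair (A, B)."""
    e_ab = sum(1 for a in a_side for b in b_side if frozenset((a, b)) in edges)
    return Fraction(e_ab, len(a_side) * len(b_side))


def low_degree_set(v1, vj, edges, delta):
    """Vertices of v1 with fewer than delta * |vj| neighbours in vj."""
    return [w for w in v1 if degree_into(w, vj, edges) < delta * len(vj)]


# Toy example: vertex ("a", i) is joined to the first i vertices of the other side.
v1 = [("a", i) for i in range(6)]
vj = [("b", i) for i in range(6)]
edges = {frozenset((("a", i), ("b", k))) for i in range(6) for k in range(i)}

delta = Fraction(1, 2)
w_set = low_degree_set(v1, vj, edges, delta)
hole_density = density(w_set, vj, edges)
```

Here the low-degree set consists of the three vertices of degree 0, 1, and 2, and the pair it forms with the other side has density $3/18 = 1/6$, which is below the threshold, as the argument predicts.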

Our next lemma is more straightforward. It allows us to count $(\varepsilon, p)$-regular subgraphs
of a graph that has a 'hole', as in Lemma 9.1(b). Recall that $\mathcal{G}(K_2, n, m, p, \varepsilon)$ denotes the
collection of all $(\varepsilon, p)$-regular bipartite graphs with $m$ edges and $n$ vertices in each part.
Given such $G$, let $V_1(G)$ and $V_2(G)$ denote the two parts. For each $\beta \in (0,1)$, define a
function $\delta\colon (0, 1] \to (0, 1)$ by setting

$$\delta(x) = \frac{1}{4e} \left( \frac{\beta}{2} \right)^{2/x^2} \tag{41}$$


for each $x \in (0, 1]$. The following lemma says that a graph $G$ that has a hole of size $\alpha n$ and
density at most $\delta(\alpha)$ has very few subgraphs in $\mathcal{G}(K_2, n, m, m/n^2, \varepsilon)$.

Lemma 9.2. For every positive $\alpha_0$ and $\beta$, there exists a positive constant $\varepsilon$ such that the
following holds. Let $G \subseteq K_{n,n}$ be such that there exist subsets $A \subseteq V_1(G)$ and $B \subseteq V_2(G)$
with

$$\min\{|A|, |B|\} \ge \alpha n \quad \text{and} \quad d_G(A, B) < \delta(\alpha)$$

for some $\alpha \in [\alpha_0, 1]$. Then, for every $m$ with […] $\le m \le n^2$, there are at most

$$\beta^m \binom{n^2}{m}$$

subgraphs of $G$ that belong to $\mathcal{G}(K_2, n, m, m/n^2, \varepsilon)$.

Proof. We begin by noting that, by choosing random subsets of $A$ and $B$ if necessary, we may
assume that $|A| = |B| = \alpha n$. Set $\varepsilon = \min\{\alpha_0, 1/2\}$, write $\mathcal{G}^*$ for the family of all subgraphs
of $G$ that belong to $\mathcal{G}(K_2, n, m, m/n^2, \varepsilon)$, and let $G' \in \mathcal{G}^*$. In particular, $G'$ is $(\varepsilon, p)$-regular,
where $p = m/n^2$, and since $\varepsilon \le \alpha$, it follows that the pair $(A, B)$ must have density at least
$(1 - \varepsilon)p$ in $G'$. Thus, writing $e_{G'}(A, B)$ for the number of edges of $G'$ that lie between the
sets $A$ and $B$, since $d_G(A, B) < \delta(\alpha)$, the number of choices for $G'$ can be estimated as
follows:

l^{l-e)p\A\\B\ \ / \ / ^ / \ / 

Note that the right-hand side of (42) is zero if $m > 2\delta(\alpha) n^2$, so we may assume that
$m \le 2\delta(\alpha) n^2 \le n^2/2$. Thus, using (15) and (16), (42) implies that

i^-u E E (^)'O. («) 
Since $\delta(\alpha) < 1/(4e)$, the summand in the right-hand side of (43) is decreasing in $i$ on
[…, $\infty$) and hence



\m J \m J 



as required, since 

We can now easily deduce Theorem 1.8 from Theorem 2.2.

Proof of Theorem 1.8. Let $H$ be a fixed 2-balanced graph, let $n \in \mathbb{N}$, and let $H(n)$ be the
largest graph in the family $\mathcal{G}(H; n, \ldots, n)$, i.e., the complete blow-up of $H$, where each
vertex of $H$ is replaced by an independent set of size $n$ and each edge of $H$ is replaced by
the complete bipartite graph $K_{n,n}$. Let $\mathcal{H}$ be the $e(H)$-uniform hypergraph on the vertex set
$E(H(n))$ whose edges are all $n^{v(H)}$ canonical copies of $H$ in $H(n)$.

Fix an arbitrary positive constant $\beta$, let $\delta\colon (0, 1] \to (0, 1)$ be the function defined in (41)
with $\beta$ replaced by $\beta^2/4$, i.e., set

$$\delta(x) = \frac{1}{4e} \left( \frac{\beta^2}{8} \right)^{2/x^2}$$



for each $x \in (0,1]$, and let $\alpha_0 = (\alpha_0)_{9.1}(H, \delta)$, $\xi = \xi_{9.1}(H, \delta)$, and $N = N_{9.1}(H, \delta)$. Let
$\mathcal{F}$ be the family of all subgraphs of $H(n)$, i.e., graphs in $\mathcal{G}(H; n, \ldots, n)$, for which (b) in
Lemma 9.1 is not satisfied. Clearly $\mathcal{F}$ is an upset, and so, by Lemma 9.1, $\mathcal{H}$ is $(\mathcal{F}, \xi)$-dense
provided that $n \ge N$.

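Now, since $\mathcal{H}$ is contained in the hypergraph of all copies of $H$ in the complete graph on
$v(H)n$ vertices and contains a positive proportion of those copies, it follows from
Proposition 7.3 that $\mathcal{H}$ satisfies the assumptions of Theorem 2.2 with $p = n^{-1/m_2(H)}$ and $\varepsilon = $ […] for
some constants $c$ and $c'$ depending only on $H$. Therefore, there is a constant $C'$, a family
$\mathcal{S} \subseteq \binom{V(\mathcal{H})}{\le C' n^{2 - 1/m_2(H)}}$, and functions $f\colon \mathcal{S} \to \overline{\mathcal{F}}$ and $g\colon \mathcal{I}(\mathcal{H}) \to \mathcal{S}$ such that

$$g(I) \subseteq I \quad \text{and} \quad I \setminus g(I) \subseteq f(g(I))$$

for every $I \in \mathcal{I}(\mathcal{H})$.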

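Let $\varepsilon$ be a sufficiently small positive constant such that, in particular, $\varepsilon \le \varepsilon_{9.2}(\alpha_0, \beta^2/4)$,
let $C = C'/\varepsilon$, and suppose that $m \ge Cn^{2 - 1/m_2(H)}$. Let $\mathcal{G}^* = \mathcal{G}^*(H, n, m, m/n^2, \varepsilon)$ and note
that $\mathcal{G}^* \subseteq \mathcal{I}(\mathcal{H})$. We are required to bound from above the number of graphs in $\mathcal{G}^*$.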

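To this end, fix an $S \in \mathcal{S}$, let

$$\mathcal{G}^*_S = \{ G \in \mathcal{G}^* : g(G) = S \},$$

and let $G_S = f(S)$. For each $\{u, w\} \in E(H)$, let $s(u, w) = e_S(V_u, V_w)$ and note that
$\sum_{\{u,w\} \in E(H)} s(u, w) = |S|$. Since

$$|S| \le C' n^{2 - 1/m_2(H)} = \varepsilon C n^{2 - 1/m_2(H)} \le \varepsilon m,$$

then $s(u, v) \le \varepsilon m$ for every $uv \in E(H)$.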

Now, since $G_S \in \overline{\mathcal{F}}$, it follows that there exist an $\alpha \in [\alpha_0, 1]$, an edge $\{i, j\} \in E(H)$, and
sets $A_i \subseteq V_i$, $A_j \subseteq V_j$ such that $|A_i|, |A_j| \ge \alpha n$ and $d_{G_S}(A_i, A_j) < \delta(\alpha)$. By Lemma 9.2, it
follows that there are at most



al \ m—s(i,j) / 2 



2 



choices for the edges between Vi and Vj such that (^[l^j, V^] G Q{K2,n,m^m/n ,e) and 
GfKj, Vj] C ^^[l^, V^]. Since m — s{i,j) ^ m/2, it follows immediately that 

Summing over sets $S \in \mathcal{S}$, and using (15) and (16), we obtain

'^'^2^1(2) n L2 - m) (771)^(2) (m) S 

'^x - /^2yW fe{H)n'\ [2my /n^V^^) ^ [2e ■ e{H)m 



2 / V m / V s / \ ) \2/ 



Now, since $\varepsilon$ was chosen to be sufficiently small, it follows that the summand above is
increasing in $s$ on $(0, \varepsilon m]$ and hence

as required. □ 